Abstract

Given a large, high-dimensional sample from a spiked population, the top sample
covariance eigenvalue is known to exhibit a phase transition. We show that the largest
eigenvalues have asymptotic distributions near the phase transition in the rank one
spiked real Wishart setting and its general $\beta$ analogue, proving a conjecture of Baik,
Ben Arous and Peche (2005). We also treat shifted mean Gaussian orthogonal and
$\beta$ ensembles. Such results are entirely new in the real case; in the complex case we
strengthen existing results by providing optimal scaling assumptions. One obtains
the known limiting random Schrodinger operator on the half-line, but the boundary
condition now depends on the perturbation. We derive several characterizations of
the limit laws in which $\beta$ appears as a parameter, including a simple linear boundary
value problem. This PDE description recovers known explicit formulas at $\beta = 2, 4$,
yielding in particular a new and simple proof of the Painleve representations for these
Tracy-Widom distributions.
1 Introduction



The study of sample covariance matrices is the oldest random matrix theory, predating 
Wigner's introduction of the Gaussian ensembles into physics by nearly three decades. Given 
a sample $X_1, \dots, X_n \in \mathbb{R}^p$ drawn from a large, centred population, form the $p \times n$ data matrix
$X = [X_1 \dots X_n]$; the $p \times p$ matrix $S = XX^\dagger$ plays a central role in multivariate statistical
analysis (Muirhead 1982, Bai 1999, Anderson 2003). The distribution in the i.i.d. Gaussian
case is named after Wishart who computed the density in 1928. The classical story is that of
the consistency of the sample covariance matrix $\frac{1}{n} S$ as an estimator of the population
covariance matrix $\Sigma = E X_i X_i^\dagger$ when the dimension $p$ is fixed and the sample size $n$
becomes large. The law of large numbers already gives $\frac{1}{n} S \to \Sigma$. In this fixed dimensional
setting, the eigenvalues $\lambda_1 \ge \dots \ge \lambda_p$ of $S$ produce consistent estimators of the eigenvalues
$\ell_1 \ge \dots \ge \ell_p$ of $\Sigma$: for example, the sample eigenvalue $\frac{1}{n} \lambda_k$ tends almost surely to
the population eigenvalue $\ell_k$ as $n \to \infty$, with Gaussian fluctuations on the order $n^{-1/2}$
(Anderson 1963). The same holds in the complex case $X_i \in \mathbb{C}^p$.

Contemporary problems typically involve high dimensional data, meaning that $p$ is
large as well, perhaps on the same order as $n$ or even larger. In this setting, say with null
covariance $\Sigma = I$, the sample eigenvalues may no longer concentrate around the population
eigenvalue 1 but rather spread out over a certain compact interval. If $p/n \to c$ with $0 < c \le 1$,
Marcenko and Pastur (1967) proved that a.s. the empirical spectral distribution $\frac{1}{p} \sum_k \delta_{\lambda_k/n}$
converges weakly to the continuous distribution with density

$$\frac{\sqrt{(b - x)(x - a)}}{2\pi c x}\, \mathbf{1}_{[a,b]}(x),$$

where $a = (1 - \sqrt{c})^2$ and $b = (1 + \sqrt{c})^2$. (The singular case $c > 1$ is similar by the obvious
duality between $n$ and $p$, except that the $p - n$ zero eigenvalues become an atom at zero of
mass $1 - c^{-1}$.) This Marcenko-Pastur law is the analogue of Wigner's semicircle law in
this setting of multiplicative rather than additive symmetrization (see also Silverstein and
Bai 1995). The assumption of Gaussian entries may be significantly relaxed.
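The spreading of the null sample eigenvalues over $[a, b]$ is easy to observe numerically. The following is a minimal sketch, assuming the standard form of the Marcenko-Pastur density with unit population variance; the function names `mp_edges`, `mp_density` and `null_sample_eigenvalues` are illustrative and do not come from the text.

```python
import numpy as np

def mp_edges(c):
    """Support edges a, b of the Marcenko-Pastur law for a ratio c in (0, 1]."""
    a = (1.0 - np.sqrt(c)) ** 2
    b = (1.0 + np.sqrt(c)) ** 2
    return a, b

def mp_density(x, c):
    """Marcenko-Pastur density (assumed standard form), zero outside [a, b]."""
    a, b = mp_edges(c)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    dens = np.zeros_like(x)
    inside = (x > a) & (x < b)
    xi = x[inside]
    dens[inside] = np.sqrt((b - xi) * (xi - a)) / (2.0 * np.pi * c * xi)
    return dens

def null_sample_eigenvalues(p, n, rng):
    """Eigenvalues of S/n where S = X X^T and X is p x n with i.i.d. N(0,1) entries."""
    X = rng.standard_normal((p, n))
    return np.linalg.eigvalsh(X @ X.T / n)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, n = 200, 800
    eigs = null_sample_eigenvalues(p, n, rng)
    a, b = mp_edges(p / n)
    # The extreme sample eigenvalues should sit close to the edges a and b.
    print("smallest:", eigs.min(), "edge a:", a)
    print("largest: ", eigs.max(), "edge b:", b)
```

A histogram of `eigs` against `mp_density` on a grid gives the usual visual comparison; real Gaussian data are used only for convenience.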

Often one is primarily interested in the largest eigenvalues, as for example in the widely 
practiced statistical method of principal components analysis. Here the goal is a good
low-dimensional projection of a high-dimensional data set, i.e. one that captures most of the
variance; the structure of the significant trends and correlations is estimated using the largest 
sample eigenvalues and their eigenvectors. The challenge is to determine which observed 
eigenvalues actually represent structure in the population, and understanding the behaviour 
in the null case is therefore an essential first step. 

In the null case the first-order behaviour is simple: $\frac{1}{n} \lambda_k \to b$ a.s. for each fixed $k$ as
$n \to \infty$, i.e. none have limits beyond the edge of the support of the limiting spectral distribution
(Geman 1980, Yin, Bai and Krishnaiah 1988). More interestingly, the fluctuations
are no longer asymptotically Gaussian but are rather those now recognized as universal at
a real symmetric or Hermitian random matrix soft edge: they are on the order $n^{-2/3}$,
asymptotically distributed according to the appropriate Tracy-Widom law. These laws
were introduced by Tracy and Widom (1994, 1996) as limiting largest eigenvalue distributions
for the Gaussian ensembles (see also Forrester 1993) and have since been found to
occur in diverse probabilistic models. The limit theorems for sample covariance matrices
were proved by Johansson (2000) in the complex case and by Johnstone (2001) in the real
case (see Soshnikov 2002 for the first universality results here). Restrictions $c \ne 0, \infty$ on the
limiting dimensional ratio were removed by El Karoui (2003) (see also Peche 2009).

Motivated by principal components analysis, it is natural to study the behaviour of 
the largest sample eigenvalues when the population covariance is not null but rather has 
a few trends or correlations. Johnstone (2001) proposed the spiked population model 
in which all but a fixed finite number of population eigenvalues (the spikes) are taken to 
be 1 as n,p become large. Baik, Ben Arous and Peche (2005) (BBP) analyzed the spiked 
complex Wishart model and discovered a very interesting phenomenon: a phase transition 
in the asymptotic behaviour of the largest sample eigenvalue as a function of the spikes. 
We restrict attention to the case of a single spike in the present chapter, setting $\ell_1 = \ell$,
$\ell_2 = \ell_3 = \dots = 1$.

In this rank one perturbed case, BBP describe three distinct regimes. Assume that
$p/n = \gamma^2$ is compactly contained in $(0, 1]$. If $\ell_{n,p}$ is compactly contained in $(0, 1 + \gamma)$ then
the behaviour of the top eigenvalue is exactly the same as in the null case:

$$P\Big( \frac{\gamma^{1/3}}{(1 + \gamma)^{4/3}}\, n^{2/3} \Big( \tfrac{1}{n} \lambda_1 - (1 + \gamma)^2 \Big) \le x \Big) \to F_2(x)$$

where $F_2$ is the Tracy-Widom law for the top GUE eigenvalue. This is the subcritical
regime. If $\ell_{n,p}$ is compactly contained in $(1 + \gamma, \infty)$ then the top eigenvalue separates from
the bulk and has Gaussian fluctuations on the order $n^{-1/2}$:

$$P\Big( \Big( \ell^2 - \frac{\gamma^2 \ell^2}{(\ell - 1)^2} \Big)^{-1/2} n^{1/2} \Big( \tfrac{1}{n} \lambda_1 - \Big( \ell + \frac{\gamma^2 \ell}{\ell - 1} \Big) \Big) \le x \Big) \to \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt.$$

This is the supercritical regime. Finally there is a one-parameter family of critical scalings
in which $\ell_{n,p} - (1 + \gamma)$ is on the order $n^{-1/3}$; these double scaling limits are tuned so
that the fluctuations (which are on the order $n^{-2/3}$ as in the subcritical case) are asymptotically
given by a certain one-parameter family of deformations of $F_2$. We refer the reader
to the original work for details. Subsequent work includes a treatment of the singular case
$p > n$ along the same lines (Onatski 2008), deeper investigations into the limiting kernels
(Desrosiers and Forrester 2006), and generalizations beyond the spiked model (El Karoui
2007) and away from Gaussianity (Bai and Yao 2008, Feral and Peche 2009). BBP conjectured
a similar phase transition for spiked real Wishart matrices, in the sense that all
scalings should be the same but the limiting distributions would be different.
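The first-order picture behind these regimes can be checked by simulation. The sketch below is only illustrative: it uses real Gaussian data for simplicity and assumes that the centering constants read off from the displays above, $(1+\gamma)^2$ in the subcritical case and $\ell + \gamma^2 \ell/(\ell - 1)$ in the supercritical case, also give the almost sure limits in the real case. The names `top_spiked_eigenvalue` and `first_order_limit` are hypothetical helpers.

```python
import numpy as np

def top_spiked_eigenvalue(p, n, ell, rng):
    """Largest eigenvalue of S/n for a rank one spiked real Wishart sample.

    The population covariance is diag(ell, 1, ..., 1): the first coordinate
    of each sample is scaled by sqrt(ell).
    """
    X = rng.standard_normal((p, n))
    X[0, :] *= np.sqrt(ell)
    return np.linalg.eigvalsh(X @ X.T / n)[-1]

def first_order_limit(ell, gamma):
    """Assumed a.s. limit of the top eigenvalue of S/n when p/n = gamma^2."""
    if ell <= 1.0 + gamma:
        return (1.0 + gamma) ** 2                   # sticks to the bulk edge
    return ell + gamma ** 2 * ell / (ell - 1.0)     # separates from the bulk

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, n = 200, 800                                 # gamma = 0.5, threshold 1.5
    for ell in (1.2, 3.0):
        print(ell, top_spiked_eigenvalue(p, n, ell, rng), first_order_limit(ell, 0.5))
```

Varying `ell` across the threshold $1 + \gamma$ shows the top eigenvalue detaching from the edge $(1 + \gamma)^2$; resolving the critical window itself requires the finer scalings discussed in the text.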

Now often referred to as the BBP transition, this picture is relevant in various applica- 
tions. Within mathematics it has been applied to the TASEP model of interacting particles 
on the line (Ben Arous and Corwin 2011). Spiked complex Wishart matrices occur in prob- 
lems in wireless communications (Telatar 1999). With these two exceptions, however, most 
applications involve data that are real rather than complex. They include economics and 
finance — Harding (2008) used the phase transition to explain an old standard example of the 
failure of PCA — and medical and population genetics — Patterson, Price and Reich (2006) 
discuss its role in attempting to answer such questions as "Given genotype data, is it from 
a homogeneous population?" Further applications include speech recognition, statistical 
learning and the physics of mixtures (see Johnstone 2007, Paul 2007, Feral and Peche 2009 
for references). In general, asymptotic distributions in the non-null cases are relevant when 
evaluating the power of a statistical test (Johnstone 2007). 

Despite these developments, the conjectured BBP picture for spiked real Wishart matrices 
has proven elusive even in the rank one case. The difficulty is with the joint eigenvalue 
density: The complex case involves an integral over the unitary group that BBP analyzed 
via the Harish-Chandra-Itzykson-Zuber integral, a tool originating in representation theory 
that appears to have no straightforward analogue over the orthogonal group. Much is known, 
however. At the level of a law of large numbers, the phase transition is described by Baik 
and Silverstein (2006); a related separation phenomenon was observed already by Bai and 
Silverstein (1998, 1999). A broad generalization of the results on a.s. limits is developed 
by Benaych-Georges and Nadakuditi (2009) and dubbed "spiked free probability theory". 
Paul (2007), Bai and Yao (2008) prove Gaussian central limit theorems in the supercritical 
regime. Feral and Peche (2009) prove Tracy-Widom fluctuations in the subcritical regime
under the scaling assumptions of BBP. Interestingly, Wang (2008) obtained a critical limiting
distribution for certain rank one spiked quaternion Wishart matrices. 

It remains to obtain the asymptotic behaviour in the critically spiked regime around 
the phase transition in the real case. We do so here, establishing the existence of limiting 
distributions under the scalings conjectured by BBP and characterizing the laws. Our results 
apply also to the complex case, and they are more general than the corresponding statements
from BBP. We do not restrict the scaling of $n, p$ beyond requiring that they tend to infinity
together, nor that of $\ell$ beyond what is strictly necessary for the existence of a limiting
distribution in the subcritical or critical regimes. We therefore allow for certain relevant
possibilities that were previously excluded, namely $p \ll n$ and $p \gg n$. The picture of the
dependence on the spike is also more complete: we include all intermediate scalings of $\ell$ with
$n, p$ across the subcritical and critical regimes. Separately, we describe a joint convergence
in law when the same underlying data is spiked with different $\ell$.

Since this article was first posted, Mo (2011) gave a different treatment of the real rank 
one case. Despite the difficulties mentioned, he succeeds with the standard program of ob- 
taining forms for the joint eigenvalue and largest eigenvalue distributions and doing asymp- 
totic analysis on the latter. His description of the limiting distribution naturally looks very 
different from ours. See Forrester (2011) for some remarks on the two treatments and an
alternative construction of the "general $\beta$" model we now introduce.

We bypass the eigenvalue density altogether; our starting point is rather a reduction of 
the matrix to tridiagonal form via Householder's algorithm, a well-known tool in numerical 
analysis. Trotter (1984) observed that the algorithm interacts nicely with the Gaussian 
structure, using the resulting forms to derive the Wigner semicircle and Marcenko-Pastur 
laws without going through their moments. Observing the similarity of the forms in the 
$\beta = 1, 2, 4$ cases, Dumitriu and Edelman (2002) introduced interpolating matrix ensembles
for all $\beta > 0$ whose eigenvalue density is given by Dyson's Coulomb or log gas model

$$\frac{1}{Z} \prod_{j<k} |\lambda_j - \lambda_k|^\beta \prod_k v(\lambda_k)^{\beta/2} \tag{1.1}$$

where $v$ is the Hermite or the Laguerre weight and $Z$ is a normalizing factor (see Forrester
2010 for more on such models). Incidentally, Trotter's argument applies to these general $\beta$
analogues and establishes Wigner semicircle and Marcenko-Pastur laws in this setting. An
extension to more general weights is part of a forthcoming work of Krishnapur, Rider and 
Virag (2011+). 

The second step is to consider the tridiagonal ensemble as a discrete random Schrodinger 
operator (i.e. discrete Laplacian plus random potential) and then take a scaling limit at 
the soft edge to obtain a certain continuum random Schrodinger operator on the half-line. 
This "stochastic operator approach to random matrix theory" was pioneered by Edelman 
and Sutton (2007), Sutton (2005); in the soft edge case their heuristics were proved by 
Ramirez, Rider and Virag (2011), who in particular established joint convergence of the 
largest eigenvalues. Our method is directly based on the latter work and we refer to it 
throughout by the initials RRV. The key point is that both steps can be adapted to the 
setting of rank one perturbations. As we will see, the limiting operator feels the perturbation 
in the boundary condition at the origin. 

In detail, let $X$ be a $p \times n$ sample matrix whose columns are independent real $N(0, \Sigma)$
with $\Sigma = \mathrm{diag}(\ell, 1, \dots, 1)$ for some $\ell > 0$; we shall say $S = XX^\dagger$ has the $\ell$-spiked $p$-variate
real Wishart distribution with $n$ degrees of freedom. (There is no loss of generality
in taking $\Sigma$ diagonal in the Gaussian case.) We also consider the complex and quaternion
cases. The tridiagonalization is carried out in detail in Section 3. The result is a symmetric
tridiagonal $(n \wedge p) \times (n \wedge p)$ matrix $W^\dagger W$, where $W$ is a certain bidiagonal matrix with the
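same nonzero singular values as $X$. Explicitly, $W$ is given by

$$W = \frac{1}{\sqrt{\beta}} \begin{bmatrix}
\sqrt{\ell}\, \tilde\chi_{\beta n} & & & & \\
\chi_{\beta(p-1)} & \tilde\chi_{\beta(n-1)} & & & \\
& \chi_{\beta(p-2)} & \tilde\chi_{\beta(n-2)} & & \\
& & \ddots & \ddots & \\
& & & \chi_{\beta(p-(n \wedge p)+1)} & \tilde\chi_{\beta(n-(n \wedge p)+1)} \\
& & & & \chi_{\beta(p-(n \wedge p))}
\end{bmatrix} \tag{1.2}$$
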
where $\beta = 1, 2, 4$ in the real, complex and quaternion cases respectively and the $\chi$'s, $\tilde\chi$'s
are mutually independent chi distributed random variables with parameters given by their
indices. In fact (1.2) makes sense for any $\beta > 0$, and the resulting ensemble $W^\dagger W$ is a
"spiked version" of the $\beta$-Laguerre ensemble of Dumitriu and Edelman (2002); we call it the
$\ell$-spiked $\beta$-Laguerre ensemble with parameters $n, p$. Such a matrix almost surely has
exactly $n \wedge p$ distinct nonzero eigenvalues by the theory of Jacobi matrices. In the null case
$\ell = 1$, their joint density is (1.1) with the Laguerre weight $v(x) = x^{|n-p|+1-2/\beta}\, e^{-x}\, \mathbf{1}_{x>0}$. We
note that there is an obvious coupling of (1.2) over all $\ell > 0$; in the spiked Wishart cases
it corresponds to the natural coupling obtained by considering $X$ as a matrix of standard
Gaussians left multiplied by $\sqrt{\Sigma}$.
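A bidiagonal model of this kind is cheap to sample for any $\beta > 0$. The sketch below assumes our reading of (1.2): a lower bidiagonal matrix with $(n \wedge p) + 1$ rows, diagonal entries $\tilde\chi_{\beta n}, \tilde\chi_{\beta(n-1)}, \dots$ (the first one multiplied by $\sqrt{\ell}$), subdiagonal entries $\chi_{\beta(p-1)}, \chi_{\beta(p-2)}, \dots$, an overall factor $1/\sqrt{\beta}$, and the convention $\chi_0 = 0$. The function name `spiked_laguerre_bidiagonal` is illustrative.

```python
import numpy as np

def spiked_laguerre_bidiagonal(n, p, beta, ell, rng):
    """Sample a bidiagonal matrix W in the spirit of (1.2) (assumed layout)."""
    m = min(n, p)

    def chi(df):
        # chi variable = square root of a chi-square; chi with 0 degrees
        # of freedom is taken to be identically 0.
        df = np.asarray(df, dtype=float)
        out = np.zeros_like(df)
        pos = df > 0
        out[pos] = np.sqrt(rng.chisquare(df[pos]))
        return out

    k = np.arange(m)
    diag = chi(beta * (n - k))        # parameters beta*n, ..., beta*(n-m+1)
    diag[0] *= np.sqrt(ell)           # the spike enters only the first entry
    sub = chi(beta * (p - 1 - k))     # parameters beta*(p-1), ..., beta*(p-m)

    W = np.zeros((m + 1, m))
    W[k, k] = diag
    W[k + 1, k] = sub
    return W / np.sqrt(beta)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = spiked_laguerre_bidiagonal(400, 100, beta=1.0, ell=1.0, rng=rng)
    top = np.linalg.eigvalsh(W.T @ W)[-1]
    print(top, (np.sqrt(400) + np.sqrt(100)) ** 2)   # top eigenvalue vs. soft edge
```

Forming the dense product `W.T @ W` is wasteful for large sizes; a tridiagonal eigenvalue solver would be the natural replacement, but the dense version keeps the sketch short.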

In order to state our results, we now recall the stochastic Airy operator introduced
by Edelman and Sutton (2007). Formally this is the random Schrodinger operator

$$\mathcal{H}_\beta = -\frac{d^2}{dx^2} + x + \frac{2}{\sqrt{\beta}}\, b'_x$$

acting on $L^2(\mathbb{R}_+)$ where $b'_x$ is standard Gaussian white noise. RRV defined this operator
rigorously and considered the eigenvalue problem $\mathcal{H}_\beta f = \Lambda f$ with Dirichlet boundary condition
$f(0) = 0$. We will consider a general homogeneous boundary condition $f'(0) = w f(0)$,
a Neumann or Robin condition for $w \in (-\infty, \infty)$ with the limiting Dirichlet case naturally
corresponding to $w = +\infty$. Precise definitions will be given in Section 2 in a more general
setting; for now, we write $\mathcal{H}_{\beta,w}$ to indicate the stochastic Airy operator together with this
boundary condition.

We will see that, almost surely, $\mathcal{H}_{\beta,w}$ is bounded below with purely discrete, simple
spectrum $\{\Lambda_0 < \Lambda_1 < \cdots\}$ for all $w \in (-\infty, \infty]$. This fact will be established simultaneously
with the standard variational characterization: in Proposition 2.8, we show in particular that
$\Lambda_k$ and the corresponding eigenfunction $f_k$ are given recursively by

$$\Lambda_k = \inf_{\substack{f \in L^2,\ \|f\| = 1, \\ f \perp f_0, \dots, f_{k-1}}} \int_0^\infty \big( f'(x)^2 + x f^2(x) \big)\, dx + w f(0)^2 + \frac{2}{\sqrt{\beta}} \int_0^\infty f^2(x)\, db_x \tag{1.3}$$

in which we consider only candidates $f$ for which the first integral is finite, and the stochastic
integral is defined pathwise via integration by parts. Recall from RRV that the distribution
$F_{\beta,\infty}$ of $-\Lambda_0$ in the Dirichlet case $w = +\infty$ may be taken as a definition of Tracy-Widom($\beta$)
for general $\beta > 0$, a one-parameter family of distributions interpolating between those at the
standard values $\beta = 1, 2, 4$. Fixing $\beta$, the distributions $F_{\beta,w}$ for finite $w$ may be thought of
as a family of deformations of Tracy-Widom($\beta$). We note that the pathwise dependence of
$\mathcal{H}_{\beta,w}$ on the Brownian motion allows the operators to be coupled over $w$ in a natural way.

Our first result gives a convergence in distribution at the soft edge of the $\ell$-spiked $\beta$-Laguerre
spectrum over the full range of subcritical and critical scalings. Note the absence
of extraneous hypotheses on $n$, $p$ and $\ell_{n,p}$.

Theorem 1.1. Let $\ell_{n,p} > 0$. Let $S = S_{n,p}$ have the real (resp. complex, quaternion) $\ell_{n,p}$-spiked
$p$-variate Wishart distribution with $n$ degrees of freedom and set $\beta = 1$ (resp. 2, 4),
or, let $\beta > 0$ and take $S_{n,p}$ from the $\ell_{n,p}$-spiked $\beta$-Laguerre ensemble with parameters $n, p$.
Writing $m_{n,p} = \big( n^{-1/2} + p^{-1/2} \big)^{-2/3}$, suppose that

$$m_{n,p} \Big( 1 - \sqrt{n/p}\, (\ell_{n,p} - 1) \Big) \to w \in (-\infty, \infty] \quad \text{as } n \wedge p \to \infty. \tag{1.4}$$

Let $\lambda_1 > \cdots > \lambda_{n \wedge p}$ be the nonzero eigenvalues of $S$. Then, jointly for $k = 1, 2, \dots$ in the
sense of finite-dimensional distributions, we have

$$\frac{m_{n,p}^2}{\sqrt{np}} \Big( \lambda_k - \big( \sqrt{n} + \sqrt{p} \big)^2 \Big) \Rightarrow -\Lambda_{k-1} \quad \text{as } n \wedge p \to \infty$$

where $\Lambda_0 < \Lambda_1 < \cdots$ are the eigenvalues of $\mathcal{H}_{\beta,w}$. Furthermore, the convergence holds jointly
with respect to the natural couplings over all $\{\ell_{n,p}\}$, $w$ satisfying (1.4).
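The normalization in Theorem 1.1 is easy to make concrete. The helpers below are a sketch under our reading of the statement, namely $m_{n,p} = (n^{-1/2} + p^{-1/2})^{-2/3}$, the spike parameter $m_{n,p}(1 - \sqrt{n/p}\,(\ell - 1))$ from (1.4), and the eigenvalue rescaling $m_{n,p}^2 (np)^{-1/2} (\lambda - (\sqrt{n} + \sqrt{p})^2)$. All function names are illustrative.

```python
import numpy as np

def soft_edge_scale(n, p):
    """Scale factor m_{n,p} = (n^{-1/2} + p^{-1/2})^{-2/3}."""
    return (n ** -0.5 + p ** -0.5) ** (-2.0 / 3.0)

def spike_parameter(n, p, ell):
    """Finite-size value of the quantity that converges to w in (1.4)."""
    return soft_edge_scale(n, p) * (1.0 - np.sqrt(n / p) * (ell - 1.0))

def critical_spike(n, p, w):
    """Spike ell for which spike_parameter(n, p, ell) equals a finite w."""
    return 1.0 + np.sqrt(p / n) * (1.0 - w / soft_edge_scale(n, p))

def rescale_eigenvalues(lams, n, p):
    """Center at the soft edge and scale as in the theorem."""
    m = soft_edge_scale(n, p)
    edge = (np.sqrt(n) + np.sqrt(p)) ** 2
    return m ** 2 / np.sqrt(n * p) * (np.asarray(lams, dtype=float) - edge)

if __name__ == "__main__":
    n, p = 4000, 1000
    for w in (-1.0, 0.0, 2.0):
        print(w, critical_spike(n, p, w))   # spikes realizing each w at this size
```

For instance, `critical_spike(n, p, 0.0)` returns $1 + \sqrt{p/n}$, the critical value $1 + \gamma$ of the BBP picture, while `spike_parameter(n, p, 1.0)` equals $m_{n,p}$ and so tends to $+\infty$ in the null case.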

Remark 1.2. In the tridiagonal basis, the convergence holds also at the level of the corresponding
eigenvectors. If the eigenvector corresponding to $\lambda_k$ is embedded in $L^2(\mathbb{R}_+)$ as a
step-function with step width $m_{n,p}^{-1}$ and support $[0, (n \wedge p)/m_{n,p}]$, then it converges to $f_{k-1}$
in distribution with respect to the $L^2$ norm; the details are the subject of the next section.
In particular, distributional convergence of the rescaled tridiagonal operators to $\mathcal{H}_{\beta,w}$ holds
in the norm resolvent sense (see e.g. Weidmann 1997). Defining $\mathcal{H}_{\beta,w}$ as a closed operator on
the appropriate (random) dense subspace of $L^2$ requires some care, however (see e.g. Savchuk
and Shkalikov 1999) and we shall not pursue it here.
Remark 1.3. The supercritical regime $w = -\infty$ sees a macroscopic separation of the largest
eigenvalue from the bulk of the spectrum; the fluctuations of $\lambda_1$ are on a larger order and they
are asymptotically Gaussian, independent of the rest. Though known for real and complex
spiked sample covariance matrices (BBP, Paul 2007, Bai and Yao 2008), existing results do
not cover intermediate "vanishingly supercritical" scalings of $\ell$ with $n, p$ and thus leave a
certain gap between the critical and supercritical regimes. This gap can be addressed using
the stochastic Airy framework (Bloemendal 2011+).

Remark 1.4. Work of Feral and Peche (2009) immediately allows extension of the previous 
theorem in the real and complex spiked Wishart cases to more general real and complex 
spiked sample covariance matrices. More precisely, the i.i.d. multivariate Gaussian columns 
of the data matrix X may be replaced with i.i.d. columns having zero mean and rank one 
spiked diagonal covariance, and satisfying some moment conditions. These authors make 
the same assumptions on the dimension ratio as BBP, but the null case universality result
of Peche (2009) suggests these could be removed.

We prove Theorem 1.1 by establishing a more general technical result, Theorem 2.10 
in Section 2. The latter theorem gives conditions under which the low-lying eigenvalues
and corresponding eigenvectors of a large random symmetric tridiagonal matrix converge 
in law to those of a random Schrodinger operator on the half-line with a given potential 
and homogeneous boundary condition at the origin. Verifying the hypotheses for suitably 
scaled spiked Laguerre matrices will be relatively straightforward; we do it in Section 3. The
approach follows that of RRV, where the null case of Theorem 1.1 is treated. 

One advantage of such an approach is that it immediately yields results for other matrix 
models as well. In particular, finite-rank additive perturbations of Gaussian orthogonal, 
unitary and symplectic ensembles (GO/U/SE) have received considerable attention. 
The analogue of the BBP theorem in the perturbed GUE setting was established by Peche 
(2006), Desrosiers and Forrester (2006). Bassler, Forrester and Frankel (2010) treat an 
interesting generalization and mention some applications to physics. We consider a simple 
additive rank one perturbation of the GOE obtained by shifting the mean of every entry by
the same constant $\mu/\sqrt{n}$. By orthogonal invariance, this has the same effect on the spectrum
as shifting the (1,1) entry by $\sqrt{n}\,\mu$. With this perturbation, the usual tridiagonalization
procedure works; the resulting form is the $\beta = 1$ case of

$$G_{\beta,\mu} = \frac{1}{\sqrt{\beta}} \begin{bmatrix}
\sqrt{2}\, g_1 + \sqrt{\beta n}\, \mu & \chi_{\beta(n-1)} & & & \\
\chi_{\beta(n-1)} & \sqrt{2}\, g_2 & \chi_{\beta(n-2)} & & \\
& \chi_{\beta(n-2)} & \sqrt{2}\, g_3 & \ddots & \\
& & \ddots & \ddots & \chi_{\beta} \\
& & & \chi_{\beta} & \sqrt{2}\, g_n
\end{bmatrix} \tag{1.5}$$

where the $g$'s are independent standard Gaussians and the $\chi$'s are independent chi random
variables indexed by their parameter as before. The analogous procedure for a shifted mean
GUE (resp. GSE) yields (1.5) with $\beta = 2$ (resp. 4). This matrix ensemble is a perturbed
version of the $\beta$-Hermite ensemble of Dumitriu and Edelman (2002). In the unperturbed
case $\mu = 0$, the joint eigenvalue density is (1.1) with the Hermite weight $v(x) = e^{-x^2/2}$.
Again, the models are naturally coupled over all $\mu$.
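The perturbed Hermite model can be sampled in the same way as the Laguerre one. This is a sketch assuming our reading of (1.5): a symmetric tridiagonal matrix with diagonal $\sqrt{2}\, g_j$, off-diagonal $\chi_{\beta(n-1)}, \dots, \chi_\beta$, the shift $\sqrt{\beta n}\,\mu$ added to the (1,1) entry, and an overall factor $1/\sqrt{\beta}$. The name `perturbed_hermite` is illustrative.

```python
import numpy as np

def perturbed_hermite(n, beta, mu, rng):
    """Sample a rank one perturbed beta-Hermite tridiagonal matrix (assumed layout)."""
    g = rng.standard_normal(n)
    dfs = beta * np.arange(n - 1, 0, -1)       # beta*(n-1), ..., beta
    off = np.sqrt(rng.chisquare(dfs))          # chi variables on the off-diagonals
    diag = np.sqrt(2.0) * g
    diag[0] += np.sqrt(beta * n) * mu          # the perturbation sits in the corner
    G = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    return G / np.sqrt(beta)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 400
    for mu in (0.0, 0.5, 2.0):
        top = np.linalg.eigvalsh(perturbed_hermite(n, 2.0, mu, rng))[-1]
        print(mu, top, 2.0 * np.sqrt(n))       # compare with the edge 2*sqrt(n)
```

With this normalization the bulk edge sits near $2\sqrt{n}$, and a shift $\mu$ well above 1 should produce a top eigenvalue near $\sqrt{n}\,(\mu + 1/\mu)$, the usual first-order location of an outlier for a rank one perturbation of size $\mu\sqrt{n}$.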

As in the spiked real Wishart setting, the critical regime for the rank one perturbed 
GOE has resisted description. We show that the phase transition in the perturbed Hermite 
ensemble has the same characterization as the one in the Laguerre ensemble. 

Theorem 1.5. Let $\mu_n \in \mathbb{R}$. Let $G = G_n$ be a $(\mu_n/\sqrt{n})$-shifted mean $n \times n$ GOE (resp. GUE,
GSE) matrix and set $\beta = 1$ (resp. 2, 4), or, let $\beta > 0$ and take $G_n = G_{\beta,\mu_n}$ as in (1.5).
Suppose that

$$n^{1/3} (1 - \mu_n) \to w \in (-\infty, \infty] \quad \text{as } n \to \infty. \tag{1.6}$$

Let $\lambda_1 > \cdots > \lambda_n$ be the eigenvalues of $G$. Then, jointly for $k = 1, 2, \dots$ in the sense of
finite-dimensional distributions, we have

$$n^{1/6} \big( \lambda_k - 2\sqrt{n} \big) \Rightarrow -\Lambda_{k-1} \quad \text{as } n \to \infty$$

where $\Lambda_0 < \Lambda_1 < \cdots$ are the eigenvalues of $\mathcal{H}_{\beta,w}$. Furthermore, the convergence holds jointly
with respect to the natural couplings over all $\{\mu_n\}$, $w$ satisfying (1.6).

Remark 1.6. The remarks following the previous theorem apply also to this theorem; the 
universality issue is discussed in Feral and Peche (2007). 

The limit of a rank one perturbed general $\beta$ soft edge thus seems to be universal, just as
at $\beta = 2$. We offer two alternative descriptions.

Theorem 1.7. Fix $\beta > 0$ and let $\Lambda_0$ be the ground state energy of $\mathcal{H}_{\beta,w}$ where $w \in (-\infty, \infty]$.
The distribution $F_{\beta,w}(x) = P(-\Lambda_0 \le x)$ has the following alternative characterizations.

(i) (RRV) Consider the stochastic differential equation

$$dp_x = \frac{2}{\sqrt{\beta}}\, db_x + \big( x - p_x^2 \big)\, dx \tag{1.7}$$

and let $P_{(x_0,w)}$ be the Ito diffusion measure on paths $\{p_x\}_{x \ge x_0}$ started from $p_{x_0} = w$.
A path almost surely either explodes to $-\infty$ in finite time or grows like $p_x \sim \sqrt{x}$ as
$x \to \infty$, and we have

$$F_{\beta,w}(x) = P_{(x,w)}(p \text{ does not explode}). \tag{1.8}$$

(ii) The boundary value problem

$$\frac{\partial F}{\partial x} + \frac{2}{\beta} \frac{\partial^2 F}{\partial w^2} + (x - w^2) \frac{\partial F}{\partial w} = 0 \quad \text{for } (x, w) \in \mathbb{R}^2, \tag{1.9}$$

$$F(x, w) \to 1 \text{ as } x, w \to \infty \text{ together}, \qquad F(x, w) \to 0 \text{ as } w \to -\infty \text{ with } x \text{ bounded above} \tag{1.10}$$

has a unique bounded solution, and we have $F_{\beta,w}(x) = F(x, w)$ for $w \in (-\infty, \infty)$. We
recover the Tracy-Widom($\beta$) distribution $F_{\beta,\infty}(x) = \lim_{w \to \infty} F(x, w)$.
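The diffusion characterization (i) suggests a crude Monte Carlo estimate of $F_{\beta,w}(x)$ for finite $w$. The sketch below is an Euler-Maruyama discretization of $dp = (x - p^2)\, dx + (2/\sqrt{\beta})\, db$ started from $p = w$ at time $x$. The step size, the finite horizon and the threshold used to declare an explosion are ad hoc assumptions of this illustration, and the scheme is not meant for large $|w|$ or for accurate tail probabilities. The function name `survival_probability` is hypothetical.

```python
import numpy as np

def survival_probability(x0, w, beta, n_paths=2000, dx=1e-3,
                         horizon=6.0, blowup=-50.0, seed=0):
    """Estimate P_{(x0, w)}(p does not explode) by simulating the diffusion."""
    rng = np.random.default_rng(seed)
    x_end = max(x0, 0.0) + horizon            # run well into the region x > 0
    n_steps = int(np.ceil((x_end - x0) / dx))
    sigma = 2.0 / np.sqrt(beta)

    p = np.full(n_paths, float(w))
    alive = np.ones(n_paths, dtype=bool)
    x = float(x0)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_paths) * np.sqrt(dx)
        p_new = p + (x - p ** 2) * dx + sigma * noise
        p = np.where(alive, p_new, p)          # exploded paths stay frozen
        alive &= p > blowup                    # below the threshold counts as explosion
        x += dx
    return alive.mean()

if __name__ == "__main__":
    for x0 in (-4.0, -2.0, 0.0, 2.0):
        print(x0, survival_probability(x0, w=0.0, beta=2.0))
```

The truncation at a finite horizon relies on the heuristic that, once $x$ is moderately large, a surviving path sits near $\sqrt{x}$ and explosion becomes very unlikely. A deterministic alternative would be to discretize the boundary value problem in (ii), which involves no sampling.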

Remark 1.8. These characterizations can be extended to the higher eigenvalues; details appear
in Section 4.

In RRV the diffusion characterization is derived with classical tools, namely the Riccati 
transformation and Sturm oscillation theory. We review the relevant facts in Section 4 before 
proceeding to the boundary value problem. While the latter characterization amounts to a 
straightforward reformulation of the former, it is appealing in that it involves no stochastic 
objects. It also turns out to offer a good way of evaluating the distributions numerically 
(Bloemendal and Sutton 2011+). Most interestingly, however, it provides a sought-after
connection with known integrable structure at $\beta = 2, 4$.

To wit, let $u(x)$ be the Hastings-McLeod solution of the homogeneous Painleve
II equation

$$u'' = 2u^3 + xu, \tag{1.11}$$

characterized by

$$u(x) \sim \mathrm{Ai}(x) \quad \text{as } x \to +\infty \tag{1.12}$$

where $\mathrm{Ai}(x)$ is the Airy function (characterized in turn by $\mathrm{Ai}'' = x\, \mathrm{Ai}$ and $\mathrm{Ai}(+\infty) = 0$); it
is known that there is a unique such function and that it has no singularities on $\mathbb{R}$ (Hastings
and McLeod 1980). Put

$$v(x) = \int_x^\infty u^2, \tag{1.13}$$

$$E(x) = \exp\Big( -\int_x^\infty u \Big), \qquad F(x) = \exp\Big( -\int_x^\infty v \Big). \tag{1.14}$$

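Next define two functions $f(x, w)$, $g(x, w)$ on $\mathbb{R}^2$, analytic in $w$ for each fixed $x$, by the first
order linear ODEs

$$\frac{\partial}{\partial w} \begin{pmatrix} f \\ g \end{pmatrix} = \begin{pmatrix} u^2 & -wu - u' \\ -wu + u' & w^2 - x - u^2 \end{pmatrix} \begin{pmatrix} f \\ g \end{pmatrix} \tag{1.15}$$

and the initial conditions

$$f(x, 0) = E(x) = g(x, 0). \tag{1.16}$$
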
Equation (1.15) is one member of the Lax pair for the Painleve II equation. The functions
$f$, $g$ can also be defined in terms of the solution of the associated Riemann-Hilbert problem;
analysis of the latter yields some information about $u$, $f$, $g$ summarized in Facts 5.1 and 5.2
below. The following theorem expresses the relationship between the objects just defined
and the general characterization at $\beta = 2, 4$. The proof is given in Section 5.

Theorem 1.9. The identities

$$F_{2,w}(x) = f(x, w)\, F(x), \tag{1.17}$$

$$F_{4,w}(x) = \left. \frac{(f + g)\, E^{-1/2} + (f - g)\, E^{1/2}}{2}\, F^{1/2} \right|_{(2^{2/3} x,\ 2^{1/3} w)} \tag{1.18}$$

hold and follow directly from Theorem 1.7 and Facts 5.1 and 5.2.

The formula for $F_{2,w}$ is given by Baik (2006), although it appeared earlier in work of Baik
and Rains (2000, 2001) in a very different context. The formula for $F_{4,w}$ appears in Baik and
Rains (2000, 2001) in a disguised form; the $w = 0$ case is obtained by Wang (2008), but it is
a new result in this context for $w \ne 0, \infty$. In the $\beta = 4$ case we thus use our characterization
to prove a guess.

In particular, we recover the Painleve II representations of Tracy and Widom at these
$\beta$ in a novel and simple way.

Corollary 1.10 (Tracy and Widom 1994, 1996, BBP 2005, Wang 2008). We have

$$F_{2,\infty}(x) = F(x), \tag{1.19}$$

$$F_{4,\infty}(2^{-2/3} x) = \tfrac{1}{2} \big( E^{1/2}(x) + E^{-1/2}(x) \big)\, F^{1/2}(x), \tag{1.20}$$

$$F_{2,0}^{1/2}(x) = F_{4,0}(2^{-2/3} x) = E^{1/2}(x)\, F^{1/2}(x). \tag{1.21}$$

Remark 1.11. The latter distribution is known to be $F_{1,\infty}(x)$ (Tracy and Widom 1996).
Unfortunately we lack an independent proof.

A number of points remain somewhat mysterious. Most obviously, we lack a connection
in the $\beta = 1$ case; while the literature previously did not even suggest a guess, it would now
be illuminating to reconcile (1.9), (1.10) with the formula obtained by Mo (2011). Even at
$\beta = 2, 4$ it seems there should be a more direct way to derive or at least understand the
connection. From the point of view of the PDE (1.9), some kind of extra structure appears
to be present at certain special values of the parameter $\beta$; what about other values? From
the point of view of nonlinear special functions, we have shown directly (independently of
any limit theorems) how the well-studied Hastings-McLeod solution admits characterization
in terms of a simple linear parabolic boundary value problem in the plane.

We close this introduction by advertising the sequel, in which we treat the general spiked
model with analogous methods.

2 The limit of a spiked tridiagonal ensemble 



In this section we strengthen the argument of RRV to apply in the rank one spiked cases. 
The main convergence result will be applied in the next section to the tridiagonal forms 
described in the introduction. 

Theorem 2.10 below generalizes Theorem 5.1 of RRV in a natural way, giving conditions 
under which the low-lying eigenvalues and corresponding eigenvectors of a random symmetric 
tridiagonal matrix converge in law to those of a random Schrodinger operator on the half- 
line with a given potential and homogeneous boundary condition at the origin. We include 
substantial parts of the original argument both for completeness and to highlight the new 
material; see Anderson, Guionnet and Zeitouni (2009) for another presentation of the original 
argument in a special case. 

Matrix model and embedding 

Underlying the convergence is the embedding of the discrete half-line $\mathbb{Z}_+ = \{0, 1, \dots\}$ into
$\mathbb{R}_+ = [0, \infty)$ via $j \mapsto j/m_n$, where the scale factors $m_n \to \infty$ but with $m_n = o(n)$. Define
an associated embedding of function spaces by step functions:

$$\ell^2_n(\mathbb{Z}_+) \hookrightarrow L^2(\mathbb{R}_+), \qquad (v_0, v_1, \dots) \mapsto v(x) = v_{\lfloor m_n x \rfloor},$$

which is isometric with $\ell^2_n$-norm $\|v\|^2 = m_n^{-1} \sum_{j=0}^\infty v_j^2$. Identify $\mathbb{R}^n$ with the initial coordinate
subspace $\{v \in \ell^2_n : v_j = 0,\ j \ge n\}$. We will generally not refer to the embedding explicitly.

We define some operators on $L^2$, all of which leave $\ell^2_n$ invariant. The translation operator
$(T_n f)(x) = f(x + m_n^{-1})$ extends the left shift on $\ell^2_n$. The difference quotient $D_n = m_n (T_n - 1)$
extends a discrete derivative. Write $E_n = \mathrm{diag}(m_n, 0, 0, \dots)$ for multiplication by $m_n \mathbf{1}_{[0, m_n^{-1})}$,
a "discrete delta function at the origin", and $R_n = \mathrm{diag}(1, \dots, 1, 0, 0, \dots)$ for multiplication
by $\mathbf{1}_{[0, n/m_n)}$, which extends orthogonal projection $\ell^2_n \to \mathbb{R}^n$.

Let $(y_{n,i;j})_{j=0,\dots,n}$, $i = 1, 2$ be two discrete-time real-valued random processes with $y_{n,i;0} = 0$,
and let $w_n$ be a real-valued random variable. Embed the processes as above. Define a
"potential" matrix (or operator)

$$V_n = \mathrm{diag}(D_n y_{n,1}) + \tfrac{1}{2} \big( \mathrm{diag}(D_n y_{n,2})\, T_n + T_n^\dagger\, \mathrm{diag}(D_n y_{n,2}) \big),$$

and finally set

$$H_n = R_n \big( D_n^\dagger D_n + V_n + w_n E_n \big). \tag{2.1}$$

This operator leaves the subspace $\mathbb{R}^n$ invariant. The matrix of its restriction with respect to
the coordinate basis is symmetric tridiagonal, with on- and off-diagonal processes

$$m_n^2 + (y_{n,1;1} + w_n)\, m_n, \quad 2m_n^2 + (y_{n,1;2} - y_{n,1;1})\, m_n, \quad \dots, \quad 2m_n^2 + (y_{n,1;n} - y_{n,1;n-1})\, m_n \tag{2.2}$$

$$-m_n^2 + \tfrac{1}{2} y_{n,2;1}\, m_n, \quad -m_n^2 + \tfrac{1}{2} (y_{n,2;2} - y_{n,2;1})\, m_n, \quad \dots, \quad -m_n^2 + \tfrac{1}{2} (y_{n,2;n-1} - y_{n,2;n-2})\, m_n \tag{2.3}$$

respectively. We denote this random matrix also as $H_n$, and call it a spiked tridiagonal
ensemble. (We could have absorbed $w_n$ into $y_{n,1}$ as an additive constant, but keep it
separate for reasons that will soon be apparent.)
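The passage from the operator formula (2.1) to the tridiagonal entries can be verified mechanically. The sketch below builds the operators on $n + 1$ coordinates as explicit matrices, assuming that $T_n$ acts as the left shift, that $D_n = m_n (T_n - 1)$, and that the off-diagonal part of $V_n$ is symmetrized with $T_n^\dagger$; the leading $n \times n$ block then plays the role of the restriction to $\mathbb{R}^n$. The function name `spiked_tridiagonal` is illustrative.

```python
import numpy as np

def spiked_tridiagonal(y1, y2, w_n, m_n):
    """Matrix of H_n = R_n (D^T D + V + w E) restricted to R^n (assumed conventions).

    y1, y2 are arrays (y_0, ..., y_n) of length n + 1 with y_0 = 0.
    """
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    n = len(y1) - 1
    N = n + 1                                   # one extra coordinate for the shift
    T = np.eye(N, k=1)                          # left shift: (T v)_j = v_{j+1}
    D = m_n * (T - np.eye(N))                   # difference quotient
    dy1 = np.append(m_n * np.diff(y1), 0.0)     # D y_1, padded at the last site
    dy2 = np.append(m_n * np.diff(y2), 0.0)     # D y_2, padded at the last site
    V = np.diag(dy1) + 0.5 * (np.diag(dy2) @ T + T.T @ np.diag(dy2))
    E = np.zeros((N, N))
    E[0, 0] = m_n                               # discrete delta at the origin
    H = D.T @ D + V + w_n * E
    return H[:n, :n]                            # R_n: keep the first n coordinates

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m_n, w_n = 5, 2.0, 0.3
    y1 = np.concatenate(([0.0], rng.standard_normal(n)))
    y2 = np.concatenate(([0.0], rng.standard_normal(n)))
    print(np.round(spiked_tridiagonal(y1, y2, w_n, m_n), 3))
```

The corner entry comes out as $m_n^2 + (y_{n,1;1} + w_n)\, m_n$: the boundary term $w_n E_n$ touches only the first diagonal entry, which is how the perturbation ends up in the boundary condition of the limit.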

As in RRV, convergence rests on a few key assumptions on the random variables just 
introduced. By choice, no additional scalings will be required. 

Assumption 1 (Tightness and convergence). There exists a continuous random process
$\{y(x)\}_{x \ge 0}$ with $y(0) = 0$ such that

$$\{y_{n,i}(x)\}_{x \ge 0},\ i = 1, 2 \text{ are tight in law}, \qquad y_{n,1} + y_{n,2} \Rightarrow y \text{ in law}$$

with respect to the compact-uniform topology on paths.

Assumption 2 (Growth and oscillation bounds). There is a decomposition

$$y_{n,i;j} = m_n^{-1} \sum_{k=0}^{j-1} \eta_{n,i;k} + \omega_{n,i;j} \tag{2.4}$$

with $\eta_{n,i;j} \ge 0$ such that for some deterministic unbounded nondecreasing continuous functions
$\bar\eta(x) > 0$, $\zeta(x) \ge 1$ not depending on $n$, and random constants $\kappa_n \ge 1$ defined on the
same probability spaces, the following hold: The $\kappa_n$ are tight in distribution, and for each $n$
we have almost surely

$$\bar\eta(x)/\kappa_n - \kappa_n \le \eta_{n,1}(x) + \eta_{n,2}(x) \le \kappa_n \big( 1 + \bar\eta(x) \big), \tag{2.5}$$

$$\eta_{n,2}(x) \le 2 m_n^2, \tag{2.6}$$

$$|\omega_{n,1}(\xi) - \omega_{n,1}(x)|^2 + |\omega_{n,2}(\xi) - \omega_{n,2}(x)|^2 \le \kappa_n \big( 1 + \bar\eta(x)/\zeta(x) \big) \tag{2.7}$$

for all $x, \xi \in [0, n/m_n]$ with $|\xi - x| \le 1$.

Assumption 3 (Critical or subcritical spiking). For some nonrandom $w \in (-\infty, \infty]$, we have

$$w_n \to w \quad \text{in probability}. \tag{2.8}$$
The necessity of the first and third assumptions will be evident when we define a continuum
limit and prove convergence. The more technical second assumption ensures tightness of
the matrix eigenvalues; its limiting version (derived in the next subsection) will guarantee
discreteness of the limiting spectrum. Lastly, we note that for given $y_n$ the models may be
coupled over different choices of $w_n$.

Reduction to deterministic setting 

In the next subsection we will define a limiting object in terms of $y$ and $w$; we want to prove
that the discrete models converge to this continuum limit in law. We reduce the problem 
to a deterministic convergence statement as follows. First, select any subsequence. It will 
be convenient to extract a further subsequence so that certain additional tight sequences 
converge jointly in law; Skorokhod's representation theorem (see Ethier and Kurtz 1986) 
says this convergence can be realized almost surely on a single probability space. We may 
then proceed pathwise. 

In detail, consider (2.4)-(2.8). Note in particular that the upper bound of (2.5) shows
that the piecewise linear process $\{\int_0^x \eta_{n,i}\}_{x \ge 0}$ is tight in distribution under the
compact-uniform topology for $i = 1,2$. Given a subsequence, we pass to a further subsequence so
that the following distributional limits exist jointly:

$$y_{n,i} \Rightarrow y_i, \qquad \int_0^{\cdot} \eta_{n,i} \Rightarrow \eta_i^\dagger, \qquad \kappa_n \Rightarrow \kappa \tag{2.9}$$

for $i = 1,2$, where convergence in the first two lines is in the compact-uniform topology.
We realize (2.9) pathwise a.s. on some probability space and continue in this deterministic
setting.

We can take the bounds (2.5), (2.7) to hold with $\kappa_n$ replaced with a single constant $\kappa$.
Observe that (2.5) gives a local Lipschitz bound on the $\int \eta_{n,i}$, which is inherited by their
limits $\eta_i^\dagger$. Thus $\eta_i = (\eta_i^\dagger)'$ is defined almost everywhere on $\mathbb{R}_+$, satisfies (2.5),
and may be defined to satisfy this inequality everywhere. Furthermore, one easily checks that
$m_n^{-1} \sum \eta_{n,i} \to \int \eta_i$ compact-uniformly as well (use continuity of the limit). Therefore
$\omega_{n,i} = y_{n,i} - m_n^{-1} \sum \eta_{n,i}$ must have a continuous limit $\omega_i$ for $i = 1,2$; moreover,
the bound (2.7) is inherited by the limits. Lastly, put $\eta = \eta_1 + \eta_2$, $\omega = \omega_1 + \omega_2$
and note that $y_i = \int \eta_i + \omega_i$ and $y = \int \eta + \omega$.

Without further reference to the subsequences, we will assume this situation for the 
remainder of the section. 
Limiting operator and variational characterization

Formally, the limit of the spiked tridiagonal ensemble $H_n$ will be the eigenvalue problem

$$\mathcal{H} f = \Lambda f \ \text{ on } \mathbb{R}_+, \qquad f'(0) = w f(0), \quad f(+\infty) = 0 \tag{2.10}$$

where $\mathcal{H} = -d^2/dx^2 + y'(x)$ and $w \in (-\infty, \infty]$ is fixed. If $w = +\infty$, the boundary
condition is to be interpreted as $f(0) = 0$; we refer to this as the Dirichlet case, and it will
require special treatment in what follows. The primary object for us will be a symmetric bilinear
form associated with the eigenvalue problem (2.10).

Define a space of test functions $C_0^\infty$ consisting of smooth functions on $\mathbb{R}_+$ with compact
support that may contain the origin, except in the Dirichlet case. Denote by $\|\cdot\|$ and
$\langle \cdot, \cdot \rangle$ the norm and inner product of $L^2[0, \infty)$. Define a weighted Sobolev norm by

$$\|f\|_*^2 = \|f'\|^2 + \|f \sqrt{1 + \bar\eta}\|^2$$

and an associated Hilbert space $L^*$ as the closure of $C_0^\infty$ under this norm. Note that our $L^*$
differs slightly from the one in RRV. We register some basic facts about $L^*$ functions.

Fact 2.1. Any $f \in L^*$ is uniformly Hölder(1/2)-continuous, satisfies $|f(x)| \le \|f\|_*$ for all
$x$, and in the Dirichlet case has $f(0) = 0$.

Proof. We have $|f(y) - f(x)| = |\int_x^y f'| \le \|f'\| \, |y - x|^{1/2}$. For $f \in C_0^\infty$ we have
$f(x)^2 = -\int_x^\infty (f^2)' \le 2 \|f'\| \|f\| \le \|f\|_*^2$; an $L^*$-bounded sequence in $C_0^\infty$
therefore has a compact-uniformly convergent subsequence, so we can extend this bound to
$f \in L^*$ and conclude further that $f(0) = 0$ in the Dirichlet case. □

For future reference, we also record some compactness properties of the L*-norm. 

Fact 2.2. Every $L^*$-bounded sequence has a subsequence converging in the following modes:
(i) weakly in $L^*$, (ii) derivatives weakly in $L^2$, (iii) uniformly on compacts, and (iv) in $L^2$.

Proof. (i) and (ii) are just Banach-Alaoglu; (iii) is the previous fact and Arzelà-Ascoli again;
(iii) implies $L^2$ convergence locally, while the uniform bound on $\int \bar\eta f_n^2$ produces the
uniform integrability required for (iv). Note that the weak limit in (ii) really is the derivative
of the limit function, as one can see by integrating against functions $\mathbf{1}_{[0,x]}$ and using
pointwise convergence. □

We introduce a symmetric bilinear form on $C_0^\infty \times C_0^\infty$ by

$$\mathcal{H}_{y,w}(\varphi, \psi) = \langle \varphi', \psi' \rangle - \langle (\varphi\psi)', y \rangle + w\, \varphi(0) \psi(0), \tag{2.11}$$

dropping the last term in the Dirichlet case. (We could have absorbed $w$ into $y$ as an additive
constant in the finite case, but prefer to keep the boundary term separate.) Formally,
$\mathcal{H}_{y,w}(\varphi, f)$ is just $\langle \varphi, \mathcal{H} f \rangle$; notice how the mixed boundary condition is
built "implicitly" into the form, while the Dirichlet boundary condition is built "explicitly"
into the space.

Lemma 2.3. There are constants $c, C > 0$ so that the following bounds hold for all $f \in C_0^\infty$:

$$c \|f\|_*^2 - C \|f\|^2 \le \mathcal{H}_{y,w}(f, f) \le C \|f\|_*^2. \tag{2.12}$$

In particular, $\mathcal{H}_{y,w}(\cdot, \cdot)$ extends uniquely to a continuous symmetric bilinear form on
$L^* \times L^*$ satisfying the same bounds.

Proof. For the first two terms of (2.11), we use the decomposition $y = \int \eta + \omega$ from the
previous subsection. Integrating the $\int \eta$ term by parts, the limiting version of (2.5) easily
yields

$$\tfrac{1}{\kappa} \|f\|_*^2 - C' \|f\|^2 \le \|f'\|^2 + \langle f^2, \eta \rangle \le \kappa \|f\|_*^2.$$

Break up the $\omega$ term as follows. The moving average $\bar\omega_x = \int_x^{x+1} \omega$ is differentiable
with $\bar\omega'_x = \omega_{x+1} - \omega_x$; writing $\omega = \bar\omega + (\omega - \bar\omega)$, we have

$$-\langle (f^2)', \omega \rangle = \langle f, \bar\omega' f \rangle + 2 \langle f', (\bar\omega - \omega) f \rangle.$$

The limiting version of (2.7) gives
$\max(|\omega_\xi - \omega_x|, |\omega_\xi - \omega_x|^2) \le C_\epsilon + \epsilon \bar\eta(x)$ for $|\xi - x| \le 1$,
where $\epsilon$ can be made small. In particular, the first term above is bounded absolutely by
$\epsilon \|f\|_*^2 + C_\epsilon \|f\|^2$. Averaging, we also get
$|\bar\omega_x - \omega_x| \le (C_\epsilon + \epsilon \bar\eta(x))^{1/2}$; Cauchy-Schwarz then bounds the second
term above absolutely by
$\sqrt{\epsilon} \int_0^\infty (f')^2 + \tfrac{1}{\sqrt{\epsilon}} \int_0^\infty f^2 (C_\epsilon + \epsilon \bar\eta)$ and thus by
$\sqrt{\epsilon}\, \|f\|_*^2 + C'_\epsilon \|f\|^2$. Now combine all the terms and set $\epsilon$ small to obtain
the result.

For the boundary term $w f(0)^2$, it suffices to obtain a bound of the form
$f(0)^2 \le \epsilon \|f\|_*^2 + C'_\epsilon \|f\|^2$. But $f(0)^2 \le 2 \|f'\| \|f\|$ from the proof of Fact 2.1
gives such a bound with $C'_\epsilon = 1/\epsilon$.

The $L^*$ form bound follows from the fact that the $L^*$-norm dominates the $L^2$-norm. We
obtain the quadratic form bound $|\mathcal{H}_{y,w}(f, f)| \le C \|f\|_*^2$; it is a standard Hilbert space
fact that it may be polarized to a bilinear form bound (see e.g. Halmos 1957). □

Definition 2.4. Call $(\Lambda, f)$ an eigenvalue-eigenfunction pair if $f \in L^*$, $\|f\| = 1$, and
for all $\varphi \in C_0^\infty$ we have

$$\mathcal{H}_{y,w}(\varphi, f) = \Lambda \langle \varphi, f \rangle. \tag{2.13}$$

Note that (2.13) then automatically holds for all $\varphi \in L^*$, by $L^*$-continuity of both sides.

Remark 2.5. This definition represents a weak or distributional version of the problem (2.10).
As further justification, integrate by parts to write the definition

$$\langle \varphi', f' \rangle - \langle (\varphi f)', y \rangle + w\, \varphi(0) f(0) = \Lambda \langle \varphi, f \rangle$$

in the form

$$\langle \varphi', f' \rangle - \langle \varphi', f y \rangle + \Big\langle \varphi', \int_0^{\cdot} f' y \Big\rangle - w f(0) \langle \varphi', 1 \rangle = -\Lambda \Big\langle \varphi', \int_0^{\cdot} f \Big\rangle,$$

which is equivalent to

$$f'(x) = w f(0) + y(x) f(x) - \int_0^x y f' - \Lambda \int_0^x f \quad \text{a.e. } x. \tag{2.14}$$

In the Dirichlet case the first term on the right is replaced with $f'(0)$. On the one hand (2.14)
shows that $f'$ has a continuous version, and the equation may be taken to hold everywhere.
In particular, $f$ satisfies the boundary condition of (2.10) at the origin. On the other hand,
(2.14) is a straightforward integrated version of the eigenvalue equation in which the potential
term has been interpreted via integration by parts. This equation will be useful in Lemma 2.7
below and is the starting point for a rigorous derivation of (1.7) in the stochastic Airy case.

Remark 2.6. The requirement $f \in L^*$ in Definition 2.4 is a technical convenience. Regarding
regularity, we need $f$ at least absolutely continuous to make sense of the eigenvalue equation
in either an integrated or a distributional sense; we have seen, however, that solutions are
in fact $C^1$. Regarding behaviour at infinity, the diffusion picture developed by RRV shows
a dichotomy: almost all solutions of the eigenvalue equation grow super-exponentially at
infinity, except for the eigenfunctions which decay sub-exponentially.

We now characterize eigenvalue-eigenfunction pairs variationally. It is easy to see that
each eigenspace is finite-dimensional: a sequence of normalized eigenfunctions must have an
$L^2$-convergent subsequence by (2.12) and Fact 2.2. By the same argument, eigenvalues can
accumulate only at infinity. In fact, more is true:

Lemma 2.7. For each $\Lambda \in \mathbb{R}$, the corresponding eigenspace is at most one-dimensional.

Proof. By linearity, it suffices to show a solution of (2.14) with $f'(0) = f(0) = 0$ must vanish
identically. Integrate by parts to write

$$f'(x) = y(x) \int_0^x f' - \int_0^x y f' - \Lambda x \int_0^x f' + \Lambda \int_0^x t f'(t)\, dt,$$

which implies that $|f'(x)| \le C(x) \int_0^x |f'|$ with some $C(x) < \infty$ increasing in $x$. Gronwall's
lemma then gives $f'(x) = 0$ for all $x \ge 0$. □

The eigenfunction corresponding to a given eigenvalue is thus uniquely specified with the
additional sign normalization $-\frac{\pi}{2} < \arg(f(0), f'(0)) \le \frac{\pi}{2}$. We order
eigenvalue-eigenfunction pairs by their eigenvalues. As usual, it follows from the symmetry of
the form that distinct eigenfunctions are $L^2$-orthogonal.
Proposition 2.8. There is a well-defined $(k+1)$st lowest eigenvalue-eigenfunction pair
$(\Lambda_k, f_k)$; it is given recursively by the minimum and minimizer in the variational problem

$$\inf_{\substack{f \in L^*,\ \|f\| = 1 \\ f \perp f_0, \ldots, f_{k-1}}} \mathcal{H}_{y,w}(f, f).$$

Remark 2.9. Since we must have $\Lambda_k \to \infty$, essentially $\{\Lambda_0, \Lambda_1, \ldots\}$ exhausts the
full spectrum and the operator has compact resolvent. We do not make this precise.

Proof. First taking $k = 0$, the infimum $\tilde\Lambda$ is finite by (2.12). Let $f_n$ be a minimizing
sequence; it is $L^*$-bounded, again by (2.12). Pass to a subsequence converging to $f \in L^*$ in all
the modes of Fact 2.2. In particular $1 = \|f_n\| \to \|f\|$, so $\mathcal{H}_{y,w}(f, f) \ge \tilde\Lambda$ by
definition. But also

$$\mathcal{H}_{y,w}(f, f) = \|f'\|^2 + \int f^2 \eta + \langle f, \bar\omega' f \rangle + 2 \langle f', (\bar\omega - \omega) f \rangle + w f(0)^2 \le \liminf \mathcal{H}_{y,w}(f_n, f_n)$$

by a term-by-term comparison. Indeed, the inequality holds for the first term by weak
convergence, and for the second term by pointwise convergence and Fatou's lemma; the
remaining terms are just equal to the corresponding limits, because the second members of
the inner products converge in $L^2$ by the bounds from the proof of Lemma 2.3 together with
$L^*$-boundedness and $L^2$-convergence. Therefore $\mathcal{H}_{y,w}(f, f) = \tilde\Lambda$.

A standard argument now shows $(\tilde\Lambda, f)$ is an eigenvalue-eigenfunction pair: taking
$\varphi \in C_0^\infty$ and $\epsilon$ small, put $f^\epsilon = (f + \epsilon\varphi)/\|f + \epsilon\varphi\|$; since $f$ is a
minimizer, $\frac{d}{d\epsilon}\big|_{\epsilon = 0} \mathcal{H}_{y,w}(f^\epsilon, f^\epsilon)$ must vanish; the latter says
precisely (2.13) with $\tilde\Lambda$. Finally, suppose $(\Lambda, g)$ is any eigenvalue-eigenfunction pair;
then $\mathcal{H}_{y,w}(g, g) = \Lambda$, and hence $\tilde\Lambda \le \Lambda$. We are thus justified in setting
$\Lambda_0 = \tilde\Lambda$ and $f_0 = f$.

Proceed inductively, minimizing now over $\{f \in L^* : \|f\| = 1,\ f \perp f_0, \ldots, f_{k-1}\}$. Again,
$L^2$-convergence of a minimizing sequence guarantees that the limit remains admissible; as
before, the limit is in fact a minimizer; conclude by applying the arguments of the previous
paragraph in the ortho-complement. The preceding lemma guarantees that
$\Lambda_0 < \Lambda_1 < \cdots$, and that the corresponding eigenfunctions $f_0, f_1, \ldots$ are uniquely
determined. □
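To make the variational picture concrete, here is a small numerical sketch in a noise-free toy case. It assumes the deterministic potential $y(x) = x^2/2$ (so $y'(x) = x$, the Airy operator), which is an assumption chosen for illustration and not a case singled out in this section; the function name and grid parameters are likewise hypothetical. The matrix is the finite-difference analogue of the form, with the boundary condition $f'(0) = w f(0)$ entering only through the top-left entry. In the continuum, the lowest eigenvalue for $w = 0$ is the negative of the first zero of $\mathrm{Ai}'$ (about 1.019), and for $w = +\infty$ the negative of the first zero of $\mathrm{Ai}$ (about 2.338).

```python
import numpy as np

def lowest_eigenvalues(w, m=40, length=15.0, k=3):
    """Lowest k eigenvalues of a finite-difference version of
    -f'' + x f = Lambda f on [0, length] with f'(0) = w f(0).

    The grid spacing is 1/m; the far end is cut off with a Dirichlet condition,
    which is harmless because the low eigenfunctions decay rapidly.
    """
    n = int(length * m)
    x = np.arange(n) / m
    diag = 2.0 * m**2 + x               # discrete Laplacian plus potential y'(x) = x
    diag[0] = m**2 + w * m + x[0]       # boundary row: spike w enters here only
    off = -m**2 * np.ones(n - 1)
    H = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)[:k]    # eigvalsh returns eigenvalues in ascending order
```

Increasing `w` adds a positive semidefinite rank one term, so every eigenvalue is nondecreasing in `w`, in line with the minimax characterization; a very large `w` approximates the Dirichlet case.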

Statement 

We are finally ready to state the main result of this section. When we speak of an eigenvalue-
eigenvector pair $(\lambda, v)$ of an $n \times n$ matrix, we take $v \in \mathbb{R}^n$ embedded in
$L^2(\mathbb{R}_+)$ as usual and normalized by $\|v\| = 1$ and
$-\frac{\pi}{2} < \arg(v_0, v_1) \le \frac{\pi}{2}$.

Theorem 2.10. Suppose that $H_n$ as in (2.1) satisfies Assumptions 1-3 and let
$(\lambda_{n,k}, v_{n,k})$ be its $(k+1)$st lowest eigenvalue-eigenvector pair. Define the corresponding
form $\mathcal{H}_{y,w}$ as in (2.11) and let $(\Lambda_k, f_k)$ be its a.s. defined $(k+1)$st lowest
eigenvalue-eigenfunction pair. Then, jointly for all $k = 0, 1, \ldots$ in the sense of
finite-dimensional distributions, we have $\lambda_{n,k} \Rightarrow \Lambda_k$ and
$v_{n,k} \Rightarrow_{L^2} f_k$ as $n \to \infty$. The convergence holds jointly over different $w_n, w$
for given $y_n, y$.

Remark 2.11. Essentially, the resolvent matrices (precomposed with the corresponding finite-
rank projections) are converging to the continuum resolvent in $L^2$-operator norm. We do
not define the resolvent operator here.

The proof will be given over the course of the next two subsections. Recall that we
proceed in the subsequential almost-sure context of the previous subsection.

Tightness 

We will need a discrete analogue of the $L^*$-norm and a counterpart of Lemma 2.3 with
constants uniform in $n$. For $v \in \mathbb{R}^n$, define the $L^*_n$-norm by

$$\|v\|_{*n}^2 = \begin{cases} \|D_n v\|^2 + \|v \sqrt{1 + \bar\eta}\|^2 & \text{if } w < \infty, \\ \|D_n v\|^2 + \|v \sqrt{1 + \bar\eta}\|^2 + w_n v_0^2 & \text{if } w = \infty, \end{cases}$$

noting that the additional term in the Dirichlet case is nonnegative for sufficiently large $n$.

Remark 2.12. As in the continuum version, the Dirichlet boundary condition must be put
explicitly into the norm (see also Lemma 2.15 below). The case considered in RRV has
$w_n = m_n$ in our notation; though it is somewhat hidden in the definitions, the $L^*_n$-norm used
there contains a term $m_n v_0^2$.

Lemma 2.13. There are constants $c, C > 0$ so that, for each $n$ and all $v \in \mathbb{R}^n$,

$$c \|v\|_{*n}^2 - C \|v\|^2 \le \langle v, H_n v \rangle \le C \|v\|_{*n}^2. \tag{2.15}$$

Proof. The derivative and potential terms may be handled exactly as in RRV (proof of
Lemma 5.6). For the spike term $w_n v_0^2$ we recall Assumption 3. In the $w < \infty$ case the $w_n$ are
bounded, so it suffices to obtain a bound of the form
$v_0^2 \le \epsilon \|v\|_{*n}^2 + C_\epsilon \|v\|^2$ for each $\epsilon > 0$, where $\epsilon, C_\epsilon$ do not depend on $n$.
Mimicking the continuum version in the proof of Fact 2.1, we have

$$v_0^2 = \langle -D_n v^2, 1 \rangle = \langle -(D_n v)(T_n v + v), 1 \rangle \le \langle -(D_n v), T_n v + v \rangle \le 2 \|D_n v\| \|v\|,$$

which gives the desired bound with $C_\epsilon = 1/\epsilon$.

In the Dirichlet case, start with (2.15) but with the spike term left out (both of the form
and the norm); it can be easily added back in by simply ensuring that $c \le 1$ and $C \ge 1$. □

Remark 2.14. If $w_n \to -\infty$ then the lower bound in Lemma 2.13 breaks down: the lowest
eigenvalue of $H_n$ really is going to $-\infty$. This is the supercritical regime.

Convergence 

We begin with a lemma, a discrete-to-continuous version of Fact 2.2.

Lemma 2.15. Let $f_n \in \mathbb{R}^n$ with $\|f_n\|_{*n}$ uniformly bounded. Then there exist $f \in L^*$ and
a subsequence along which (i) $f_n \to f$ uniformly on compacts, (ii) $f_n \to_{L^2} f$, and (iii)
$D_n f_n \to f'$ weakly in $L^2$.

Proof. Consider $g_n(x) = f_n(0) + \int_0^x D_n f_n$, a piecewise-linear version of $f_n$; they coincide at
points $x = i/m_n$, $i \in \mathbb{Z}_+$. One easily checks that $\|g_n\|_* \le 2 \|f_n\|_{*n}$, so some subsequence
$g_n \to f \in L^*$ in all the modes of Fact 2.2; in the Dirichlet case, the extra term in the $L^*_n$ norm
guarantees that $f(0) = 0$. But then also $f_n \to f$ compact-uniformly by a simple argument
using the uniform continuity of $f$, $f_n \to_{L^2} f$ because
$\|f_n - g_n\|^2 \le (1/3m_n^2) \|D_n f_n\|^2$, and $D_n f_n \to f'$ weakly in $L^2$ because
$D_n f_n = g_n'$ a.e. □

Next we establish a kind of weak convergence of the form $\langle \cdot, H_n \cdot \rangle$ to
$\mathcal{H}_{y,w}(\cdot, \cdot)$. Let $\mathcal{P}_n$ be orthogonal projection from $L^2$ onto $\mathbb{R}^n$. One can
check the following: for $f \in L^2$, $\mathcal{P}_n f \to_{L^2} f$ (the Lebesgue differentiation theorem gives
pointwise convergence and we have uniform $L^2$-integrability); for smooth $f$, $\mathcal{P}_n f \to f$
uniformly on compacts; further, if $f' \in L^2$ then $D_n f \to_{L^2} f'$ ($D_n f$ is a convolution of $f'$
with an approximate delta). Observe that $\mathcal{P}_n$ commutes with $R_n$ and with $D_n R_n$.

Lemma 2.16. Let $f_n \to f$ be as in the hypothesis and conclusion of Lemma 2.15. Then for
all $\varphi \in C_0^\infty$ we have $\langle \varphi, H_n f_n \rangle \to \mathcal{H}_{y,w}(\varphi, f)$. In particular,
$\mathcal{P}_n \varphi \to \varphi$ in this way and so

$$\langle \mathcal{P}_n \varphi, H_n \mathcal{P}_n \varphi \rangle = \langle \varphi, H_n \mathcal{P}_n \varphi \rangle \to \mathcal{H}_{y,w}(\varphi, \varphi). \tag{2.16}$$

Proof. Note that if $f_n \to_{L^2} f$, $g_n$ is $L^2$-bounded and $g_n \to g$ weakly in $L^2$, then
$\langle f_n, g_n \rangle \to \langle f, g \rangle$. Therefore the derivative term of $\langle \varphi, H_n f_n \rangle$, namely
$\langle D_n \varphi, D_n f_n \rangle$, converges to $\langle \varphi', f' \rangle$. The potential term converges as
in RRV (proof of Lemma 5.7). Moreover, the spike term converges to the boundary term:

$$w_n f_n(0) (\mathcal{P}_n \varphi)(0) \to w f(0) \varphi(0),$$

where in the Dirichlet case the left side vanishes for $n$ large because $\varphi$ is supported away
from 0.

For the second statement, the uniform $L^*_n$ bound follows from the following observations:
$\|(\mathcal{P}_n \varphi) \sqrt{1 + \bar\eta}\| = \|\mathcal{P}_n (\varphi \sqrt{1 + \bar\eta})\| \le \|\varphi \sqrt{1 + \bar\eta}\|$;
for $n$ large enough that $R_n \varphi = \varphi$ we have
$\|D_n \mathcal{P}_n \varphi\| = \|\mathcal{P}_n D_n \varphi\| \le \|D_n \varphi\| \le \|\varphi'\|$ (Young's inequality); and in
the Dirichlet case, the extra term vanishes for $n$ large. The convergence is easy:
$\mathcal{P}_n \varphi \to \varphi$ compact-uniformly and in $L^2$, and for $g \in L^2$ we have
$\langle g, D_n \mathcal{P}_n \varphi \rangle = \langle \mathcal{P}_n g, D_n \varphi \rangle \to \langle g, \varphi' \rangle$. □

Finally, we recall the argument of RRV to put all the pieces together. 

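Proof of Theorem 2.10. First we show that for all $k$ we have
$\underline{\lambda}_k = \liminf \lambda_{n,k} \ge \Lambda_k$. Assume that $\underline{\lambda}_k < \infty$. The
eigenvalues of $H_n$ are uniformly bounded below by Lemma 2.13, so there is a subsequence along
which $(\lambda_{n,1}, \ldots, \lambda_{n,k}) \to (\xi_1, \ldots, \xi_k = \underline{\lambda}_k)$. By the same lemma
the corresponding eigenvector sequences have $L^*_n$-norm uniformly bounded; pass to a further
subsequence so that they all converge as in Lemma 2.15. The limit functions are orthonormal,
and by Lemma 2.16 they are eigenfunctions with eigenvalues $\xi_j$. There are therefore $k$ distinct
eigenvalues at most $\underline{\lambda}_k$, as required.

We proceed by induction, assuming the conclusion of the theorem up to $k - 1$. First find
$f_k^\epsilon \in C_0^\infty$ with $\|f_k^\epsilon - f_k\|_* < \epsilon$. Consider the vector

$$f_{n,k} = \mathcal{P}_n f_k^\epsilon - \sum_{j=0}^{k-1} \langle v_{n,j}, \mathcal{P}_n f_k^\epsilon \rangle\, v_{n,j}.$$

The $L^*_n$-norm of the sum term is uniformly bounded by $C\epsilon$: indeed, the $\|v_{n,j}\|_{*n}$ are
uniformly bounded by Lemma 2.13, while the coefficients satisfy
$|\langle v_{n,j}, f_k^\epsilon \rangle| \le \|f_k^\epsilon - f_k\| + \|v_{n,j} - f_j\| < 2\epsilon$ for large $n$. By the
variational characterization in finite dimensions, and the uniform $L^*_n$ form bound on
$\langle \cdot, H_n \cdot \rangle$ (Lemma 2.13) together with the uniform bound on
$\|\mathcal{P}_n f_k^\epsilon\|_{*n}$ (Lemma 2.16), we then have

$$\limsup \lambda_{n,k} \le \limsup \frac{\langle f_{n,k}, H_n f_{n,k} \rangle}{\langle f_{n,k}, f_{n,k} \rangle} = \limsup \frac{\langle \mathcal{P}_n f_k^\epsilon, H_n \mathcal{P}_n f_k^\epsilon \rangle}{\langle \mathcal{P}_n f_k^\epsilon, \mathcal{P}_n f_k^\epsilon \rangle} + o_\epsilon(1), \tag{2.17}$$

where $o_\epsilon(1) \to 0$ as $\epsilon \to 0$. But (2.16) of Lemma 2.16 provides
$\lim \langle \mathcal{P}_n f_k^\epsilon, H_n \mathcal{P}_n f_k^\epsilon \rangle = \mathcal{H}_{y,w}(f_k^\epsilon, f_k^\epsilon)$, so the right
hand side of (2.17) is

$$\frac{\mathcal{H}_{y,w}(f_k^\epsilon, f_k^\epsilon)}{\langle f_k^\epsilon, f_k^\epsilon \rangle} + o_\epsilon(1) = \frac{\mathcal{H}_{y,w}(f_k, f_k)}{\langle f_k, f_k \rangle} + o_\epsilon(1) = \Lambda_k + o_\epsilon(1).$$

Now letting $\epsilon \to 0$, we conclude $\limsup \lambda_{n,k} \le \Lambda_k$.

Thus $\lambda_{n,k} \to \Lambda_k$; Lemmas 2.13 and 2.16 imply that any subsequence of the $v_{n,k}$ has a
further subsequence converging in $L^2$ to some $g \in L^*$ with $(\Lambda_k, g)$ an
eigenvalue-eigenfunction pair. But then $g = f_k$, and so $v_{n,k} \to_{L^2} f_k$. □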
3 Application to Wishart and Gaussian models



We now apply Theorem 2.10 to prove Theorems 1.1 and 1.5. The first step is to obtain the 
tridiagonal forms. Then, after recalling the derivation of the scaling limit at the soft edge, 
we verify Assumptions 1-3 for certain scalings of the perturbation. 

Tridiagonalization 

We explain how to tridiagonalize a rank one spiked real Wishart matrix; the algorithm is
basically the usual one described by Trotter (1984) with a few careful choices. We restrict for
the moment to the case $n \ge p$, but lift this restriction in Remark 3.1 below. For a given
$p \times n$ data matrix $X$ we will construct a pair of orthogonal matrices $O \in O(p)$, $O' \in O(n)$
so that $W = OXO'$ becomes lower bidiagonal; then $X$ and $W$ have the same singular values
and $WW^\dagger$ is a symmetric tridiagonal matrix with the same eigenvalues as $XX^\dagger$. Further,
the structure of $X$ and $O$, $O'$ will be such that the entries of $W$ are independent with explicit
known distributions.

We build up $O$ and $O'$ as follows. Let $e_1, \ldots, e_p \in \mathbb{R}^p$ be the standard basis of column
vectors and $\tilde e_1, \ldots, \tilde e_n \in \mathbb{R}^n$ the standard basis of row vectors.

• First, reflect (or rotate) the top row of $X$ into the positive $\tilde e_1$ direction via right
multiplication by $O'_1 \in O(n)$, chosen independently of the other rows. This row becomes
$\tilde\chi_n \tilde e_1$, where $\tilde\chi_n$ is a Chi($n$) random variable (i.e. distributed as the length of an
$n$-dimensional standard normal vector); the other rows remain independent standard
normal vectors, since their distribution is invariant under an independent reflection.

• Next, reflect the first column of $XO'_1$ as follows: leaving $\langle e_1 \rangle$ invariant, reflect the
orthogonal $\langle e_2, \ldots, e_p \rangle$ component of the column into the positive $e_2$ direction via left
multiplication by $O_1 \in \{I_1\} \oplus O(p-1)$, chosen independently of the other columns.
This component of the column becomes $\chi_{p-1} e_2$ where $\chi_{p-1} \sim$ Chi($p-1$), independent of
$\tilde\chi_n$. The same components of the other columns remain independent standard normal
vectors, while the first row is untouched.

• Reflect the second row of $O_1 X O'_1$ as follows: leaving $\langle \tilde e_1 \rangle$ invariant, reflect the
orthogonal component of the row into the positive $\tilde e_2$ direction via right multiplication by
$O'_2 \in \{I_1\} \oplus O(n-1)$, chosen independently of the other rows.

• Reflect the second column of $O_1 X O'_1 O'_2$ as follows: leaving $\langle e_1, e_2 \rangle$ invariant, reflect
the orthogonal component of the column into the positive $e_3$ direction via left multiplication
by $O_2 \in \{I_2\} \oplus O(p-2)$, chosen independently of the other columns.

• Continue in this way, alternately reflecting rows and columns while leaving the results
of previous steps untouched.
The result is that with $O' = O'_1 \cdots O'_p$ and $O = O_{p-1} \cdots O_1$ we have

$$W = OXO' = \begin{bmatrix} \tilde\chi_n & & & & \\ \chi_{p-1} & \tilde\chi_{n-1} & & & \\ & \ddots & \ddots & & \\ & & \chi_2 & \tilde\chi_{n-p+2} & \\ & & & \chi_1 & \tilde\chi_{n-p+1} \end{bmatrix}$$

where $\{\tilde\chi_{n-j}\}_{j=0}^{p-1}$ and $\{\chi_{p-j}\}_{j=1}^{p-1}$ are independent Chi random variables of
parameters given by their indices. We have truncated the $n - p$ rightmost columns of zeros to
obtain a $p \times p$ matrix, leaving the product $WW^\dagger$ unchanged. We will actually work with
$W^\dagger W$ below, which has the same eigenvalues.
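The alternating row and column reflections can be sketched numerically as follows. This is a minimal illustration using Householder reflections applied to a matrix with standard normal entries and $n \ge p$; the helper names `reflector` and `bidiagonalize` are hypothetical, and the sketch does not reproduce any particular choice of reflections beyond sending each vector to the positive coordinate direction.

```python
import numpy as np

def reflector(x):
    """Symmetric orthogonal matrix Q (a Householder reflection) with Q @ x = |x| e_1."""
    x = np.asarray(x, dtype=float)
    v = x.copy()
    v[0] -= np.linalg.norm(x)
    vv = v @ v
    if vv < 1e-30:                      # x is already a nonnegative multiple of e_1
        return np.eye(len(x))
    return np.eye(len(x)) - 2.0 * np.outer(v, v) / vv

def bidiagonalize(X):
    """Make a p x n matrix (n >= p) lower bidiagonal by alternating reflections."""
    W = np.array(X, dtype=float)
    p, n = W.shape
    for k in range(p):
        # Row step (right multiplication): send the part of row k lying in
        # columns k, k+1, ... to a positive multiple of the k-th basis row vector.
        W[:, k:] = W[:, k:] @ reflector(W[k, k:])
        if k < p - 1:
            # Column step (left multiplication): send the part of column k lying in
            # rows k+1, k+2, ... to a positive multiple of the (k+1)-th basis vector.
            W[k + 1:, :] = reflector(W[k + 1:, k]) @ W[k + 1:, :]
    return W
```

Because only orthogonal matrices are applied on either side, the singular values of the input are preserved, while the diagonal and subdiagonal of the output are the lengths of the vectors reflected at each step.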

Remark 3.1. Attempting the above procedure in the case $n < p$ produces a lower bidiagonal
matrix $W$ with $n + 1$ nonzero rows. The matrix $W^\dagger W$ is now $n \times n$, has the same nonzero
eigenvalues as $XX^\dagger$, and looks just like it does in the $n \ge p$ case except for a discrepancy in
the bottom-right corner. The two cases may in fact be unified if one agrees that $\chi_0 = 0$; then
$W$ is $(n \wedge p + 1) \times (n \wedge p)$ and has the form (1.2) with $\beta = 1$, while $W^\dagger W$ is
$(n \wedge p) \times (n \wedge p)$.

The same algorithm will tridiagonalize a rank one spiked complex (resp. quaternionic)
Wishart matrix by unitary (resp. symplectic or hyperunitary) conjugations. The lower
bidiagonal matrix will be $W$ from (1.2) with $\beta = 2$ (resp. 4).

The perturbed GOE/GUE/GSE ensembles are even easier to tridiagonalize; as in the
Wishart case, the usual procedure of Trotter (1984) works without modification. Starting
with an $n \times n$ GOE matrix $M$ with a perturbation in the (1,1) entry, the upshot is that for
certain $O_1, \ldots, O_{n-1}$ with $O_j \in \{I_j\} \oplus O(n - j)$ the conjugated matrix
$O_{n-1} \cdots O_1 M O_1^\dagger \cdots O_{n-1}^\dagger$ has the form (1.5) with $\beta = 1$. We do not detail it
further here.

Scaling limit 

Consider the $\ell$-spiked $\beta$-Laguerre ensemble $S = W^\dagger W$ with $W = W_{n,p}$ as in (1.2),
recalling that $S_{n,p}$ is $(n \wedge p) \times (n \wedge p)$. The diagonal and off-diagonal processes of
$\beta S$ are

$$\ell_{n,p}\, \tilde\chi^2_{\beta n} + \chi^2_{\beta(p-1)},\quad \tilde\chi^2_{\beta(n-1)} + \chi^2_{\beta(p-2)},\quad \tilde\chi^2_{\beta(n-2)} + \chi^2_{\beta(p-3)},\quad \ldots$$

$$\tilde\chi_{\beta(n-1)} \chi_{\beta(p-1)},\quad \tilde\chi_{\beta(n-2)} \chi_{\beta(p-2)},\quad \ldots$$

respectively. The usual centering and rescaling for fluctuations at the soft edge, as well as
the operator limit itself, can be predicted using the approximations

$$\chi_k \approx \sqrt{k} + \sqrt{1/2}\, g, \qquad \chi_k^2 \approx k + \sqrt{2k}\, g,$$

valid for $k$ large, where $g$ is a suitably coupled standard Gaussian. We briefly reproduce the
heuristic argument.

To leading order, the top-left corner of $S$ has $n + p$ on the diagonal and $\sqrt{np}$ on the
off-diagonal. So the top-left corner of

$$\frac{1}{\sqrt{np}} \Big( (\sqrt{n} + \sqrt{p})^2 I - S \Big)$$

is approximately an unscaled discrete Laplacian. If time is scaled by $m^{-1}$, space has to be
scaled by $m^2$ for this to converge to $-\frac{d^2}{dx^2}$. The next order terms for the $j$th diagonal and
off-diagonal entries of $S$, where $j \ll n \wedge p$, are respectively

$$\tfrac{1}{\sqrt\beta} \big( \sqrt{2n}\, \tilde g_{n-j+1} + \sqrt{2p}\, g_{p-j} \big) - 2j,$$

$$\tfrac{1}{\sqrt\beta} \big( \sqrt{p/2}\, \tilde g_{n-j} + \sqrt{n/2}\, g_{p-j} \big) - \tfrac12 \big( \sqrt{p/n} + \sqrt{n/p} \big) j$$

(we have indexed the $g$'s to match the corresponding $\chi$'s). The total noise per unit (unscaled)
time is like $\tfrac{2}{\sqrt\beta} (\sqrt{n} + \sqrt{p})\, g$; convergence to $\tfrac{2}{\sqrt\beta}$ times standard Gaussian
white noise $b'_x$ then requires $(\sqrt{n} + \sqrt{p})\, m^2 / \sqrt{np} = m^{1/2}$. The averaged part of the
potential requires $\big( 2 + \sqrt{p/n} + \sqrt{n/p} \big) m^2 / \sqrt{np} = m^{-1}$ to converge to the function
$-x$. Fortunately these two scaling requirements match perfectly; we set

$$m_{n,p} = \Big( \frac{\sqrt{np}}{\sqrt{n} + \sqrt{p}} \Big)^{2/3}, \qquad H_{n,p} = \frac{m_{n,p}^2}{\sqrt{np}} \Big( (\sqrt{n} + \sqrt{p})^2 I_{n \wedge p} - S_{n,p} \Big)$$

and set the integrated limiting potential to

$$y(x) = \tfrac12 x^2 + \tfrac{2}{\sqrt\beta}\, b_x$$

where $b_x$ is a standard Brownian motion. Note that

$$2^{-2/3} (n \wedge p)^{1/3} \le m \le (n \wedge p)^{1/3},$$

so the conditions $m \to \infty$, $m = o(n \wedge p)$ are met by merely having $n, p \to \infty$ together.
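The matching of the two scaling requirements, and the two-sided bound on $m$, can be checked numerically. The sketch below assumes the value $m_{n,p} = \big(\sqrt{np}/(\sqrt{n} + \sqrt{p})\big)^{2/3}$ obtained by solving either requirement for $m$; the function names are hypothetical, and a small relative tolerance is used because the lower bound is attained exactly when $n = p$.

```python
import math

def soft_edge_scale(n, p):
    """m_{n,p} = (sqrt(np) / (sqrt(n) + sqrt(p)))^(2/3)."""
    return (math.sqrt(n * p) / (math.sqrt(n) + math.sqrt(p))) ** (2.0 / 3.0)

def check_scaling(n, p, rel=1e-12):
    """Return (noise requirement holds, drift requirement holds, bounds hold)."""
    m = soft_edge_scale(n, p)
    rn, rp = math.sqrt(n), math.sqrt(p)
    # Noise requirement: (sqrt(n) + sqrt(p)) m^2 / sqrt(np) = m^(1/2).
    noise_ok = math.isclose((rn + rp) * m**2 / (rn * rp), math.sqrt(m))
    # Drift requirement: (2 + sqrt(p/n) + sqrt(n/p)) m^2 / sqrt(np) = 1/m.
    drift_ok = math.isclose((2 + rp / rn + rn / rp) * m**2 / (rn * rp), 1.0 / m)
    # Two-sided bound in terms of min(n, p).
    lo = 2 ** (-2.0 / 3.0) * min(n, p) ** (1.0 / 3.0)
    hi = min(n, p) ** (1.0 / 3.0)
    bounds_ok = lo * (1 - rel) <= m <= hi * (1 + rel)
    return noise_ok, drift_ok, bounds_ok
```

For example, `check_scaling(1000, 10)` tests a strongly rectangular case, where $m$ is close to the upper bound $(n \wedge p)^{1/3}$.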

We now carefully decompose $H_{n,p}$ as in (2.1). In (2.2), (2.3) there is a little freedom
between $y_{n,1;1}$ and $w_n$, but only up to an additive constant in $y_{n,1}$ that tends to zero in
probability anyway. Thus we may as well set $y_{n,1;1} = 0$ to fix $w_n$ and $y_{n,1}$. Assumptions 1
and 2 (the CLT (2.4) and required tightness (2.5)-(2.7) for the potential terms $y_{n,i}$) are then
verified exactly as in the final subsection of RRV.
It remains to consider Assumption 3. We have

$$w_n = m_{n,p} \Big( 1 + \sqrt{\tfrac{n}{p}} \Big( 1 - \ell_{n,p} \frac{\tilde\chi^2_{\beta n}}{\beta n} \Big) + \sqrt{\tfrac{p}{n}} \Big( 1 - \frac{\chi^2_{\beta(p-1)}}{\beta p} \Big) \Big).$$

First order heuristics suggest we take $\ell_{n,p}$ to satisfy

$$\tilde w_n = m_{n,p} \Big( 1 + \sqrt{\tfrac{n}{p}} \big( 1 - \ell_{n,p} \big) \Big) \to w \in (-\infty, \infty] \quad \text{as } n \wedge p \to \infty$$

as in (1.4). We want to show that, in this case, $w_n \to w$ in probability; it is certainly enough
to show that $w_n - \tilde w_n \to 0$ in probability.

Second order heuristics say the error terms are on the order $(n \wedge p)^{-1/6}$ or $m^{-1/2}$, and $L^2$
estimates easily provide the rigour. All we need is that $\chi_k^2$ has mean $k$ and variance $2k$. We
have

$$w_n - \tilde w_n = \frac{m\, \ell_{n,p}}{\beta \sqrt{np}} \big( \beta n - \tilde\chi^2_{\beta n} \big) + \frac{m}{\beta \sqrt{np}} \big( \beta(p-1) - \chi^2_{\beta(p-1)} \big) + \frac{m}{\sqrt{np}}.$$

Using that $\ell_{n,p} \le 1 + 2\sqrt{p/n}$, the mean square of the first term is $O(m^2/p + m^2/n)$, which is
$O(m^{-1})$. The mean square of the second term is $O(m^2/n)$, again $O(m^{-1})$. The last term is
negligible. This completes the proof of Theorem 1.1.

Turning now to the perturbed $\beta$-Hermite ensemble, take $G_n$ as in (1.5). With
heuristic motivation similar to that in the previous proof, set

$$m_n = n^{1/3}, \qquad H_n = \frac{m_n^2}{\sqrt{n}} \big( 2\sqrt{n}\, I_n - G_n \big)$$

and $y(x)$ as before. Decompose $H_n$ as in (2.1). Again, the verification of Assumptions 1 and
2 on $y_{n,i}$ proceeds as in RRV (Lemmas 6.2, 6.3). Moving on to Assumption 3, we have

$$w_n = m_n \Big( 1 - \big( \mu_n + \sqrt{2/(\beta n)}\, g_1 \big) \Big).$$

Putting

$$\tilde w_n = m_n (1 - \mu_n)$$

as in (1.6), the difference is $w_n - \tilde w_n = -n^{-1/6} \sqrt{2/\beta}\, g_1$. It follows that
$w_n - \tilde w_n \to 0$ in probability, which completes the proof of Theorem 1.5.

4 Alternative characterizations of the laws 

In this section we prove Theorem 1.7 and its extension to higher eigenvalues. 
Diffusion

The diffusion characterization is developed in RRV. The starting point is an application of
the classical Riccati map $p = f'/f$ to the eigenvalue equation (2.10), or rigorously to (2.14);
the result is the first order differential equation

$$p'(x) = x - \lambda + \tfrac{2}{\sqrt\beta}\, b'(x) - p^2(x) \tag{4.1}$$

understood also in the integrated sense. The boundary condition at the origin becomes the
initial value

$$p(0) = w,$$

and a zero of $f$ would have $p$ explode to $-\infty$ and immediately restart at $+\infty$.

One can in fact construct the solution for any $\lambda \in \mathbb{R}$. One way to see this is to introduce
the variable $q(x) = p(x) - \tfrac{2}{\sqrt\beta}\, b(x)$; the ODE

$$q' = x - \lambda - \big( q + \tfrac{2}{\sqrt\beta}\, b \big)^2 \tag{4.2}$$

is classical and the Picard existence and uniqueness theorem applies. Although solutions can
explode to $-\infty$ in finite time, this is not a problem if we consider the values on the projective
line. Behaviour through $\infty$ can then be understood in the other coordinate $\tilde q = 1/q$, which
evolves as

$$\tilde q' = \big( 1 + \tfrac{2}{\sqrt\beta}\, b\, \tilde q \big)^2 - (x - \lambda)\, \tilde q^2;$$

in particular, $\tilde q' = 1$ whenever $\tilde q = 0$. The solution can thus be continued for all time.
Moreover, it depends monotonically and continuously on the parameter $\lambda$, uniformly
on compact time-intervals with respect to the topology of the projective line. Following
classical Sturm oscillation theory one can argue that almost surely, for all $\lambda \in \mathbb{R}$, the number
of eigenvalues strictly less than $\lambda$ equals the number of explosions of $p$ on $\mathbb{R}_+$.
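The oscillation count can be explored numerically by integrating the linear equation behind the Riccati map and counting zeros of $f$, since each zero of $f$ is an explosion of $p = f'/f$. The sketch below is a crude Euler-type scheme under stated assumptions: the function name, step size, cutoff `x_max` and seeding are all hypothetical choices, finite $w$ only, and with `beta = math.inf` the noise is switched off, giving the deterministic Airy case in which the eigenvalues for $w = 0$ are the negatives of the zeros of $\mathrm{Ai}'$ (about 1.019, 3.248, 4.820).

```python
import math
import numpy as np

def count_explosions(lam, w, beta=math.inf, x_max=None, h=1e-3, seed=0):
    """Count zeros of f on (0, x_max] where f'' = (x - lam) f + (2/sqrt(beta)) f b',
    f(0) = 1, f'(0) = w.  Each zero of f is an explosion of p = f'/f to -infinity."""
    if x_max is None:
        x_max = max(lam, 0.0) + 10.0      # heuristic cutoff past the oscillatory region
    steps = int(x_max / h)
    sigma = 0.0 if math.isinf(beta) else 2.0 / math.sqrt(beta)
    db = np.random.default_rng(seed).standard_normal(steps) * math.sqrt(h)
    f, g, zeros = 1.0, float(w), 0        # g plays the role of f'
    for j in range(steps):
        x = j * h
        g += (x - lam) * f * h + sigma * f * db[j]   # increment of f'
        f_new = f + g * h                            # increment of f
        if (f_new < 0.0) != (f < 0.0):               # sign change: f crossed zero
            zeros += 1
        f = f_new
        scale = abs(f) + abs(g)
        if scale > 1e100:                            # rescale; p = g / f is unchanged
            f, g = f / scale, g / scale
    return zeros
```

In the noise-free case with $w = 0$, taking `lam` between consecutive Airy-derivative eigenvalues lets one compare the returned count with the number of eigenvalues below `lam`.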

For a fixed $\lambda$, the Riccati equation (4.1) may also be understood in the Itô sense; by
translation equivariance the time-shift $x \mapsto x - \lambda$ produces the same path measure as the
Itô diffusion (1.7) started at time $x_0 = -\lambda$. Writing $\kappa_{(x_0, w_0)}$ for the distribution of the
first explosion time of $p_x$ under $P_{(x_0, w_0)}$ (an improper distribution with some mass on
$\infty$), we have $P_{\beta,w}(\Lambda_0 < \lambda) = \kappa_{(-\lambda, w)}(\mathbb{R})$ or
$F_{\beta,w}(x) = \kappa_{(x,w)}(\{\infty\})$ as in (1.8). More generally, the strong Markov property gives

$$P_{\beta,w}(-\Lambda_{k-1} > x) = \int_{\mathbb{R}^k} \kappa_{(x,w)}(dx_1)\, \kappa_{(x_1,\infty)}(dx_2) \cdots \kappa_{(x_{k-1},\infty)}(dx_k). \tag{4.3}$$

The stated path properties of (1.7) appear also in RRV (Propositions 3.7 and 3.9).
Boundary value problem

Briefly, the boundary value problem is just the Kolmogorov backward equation for a hitting
probability of the diffusion. We assume the diffusion representation
$F_{\beta,w}(x) = \kappa_{(x,w)}(\{\infty\})$ for the distribution of $-\Lambda_0$.

Lemma 4.1. For each fixed $x$, $F_{\beta,w}(x)$ is nondecreasing and continuous in
$w \in (-\infty, \infty]$ and tends to zero as $w \to -\infty$.

Remark 4.2. There are in fact almost-sure counterparts of these assertions that describe how
$\Lambda_0$ depends on $w$ for each Brownian path, but we do not need them here.

Proof. The monotonicity is a consequence of uniqueness of the diffusion path from each
space-time point: two paths started from $(x, w_0)$ and $(x, w_1)$ with $w_0 < w_1$ never cross, so if
the upper path explodes to $-\infty$ then the lower path must do so as well. The continuity is a
general property of statistics of diffusions: $\kappa_{(x, p_x)}(\{\infty\})$ is a martingale, so
$F_{\beta,w}(x)$ is in fact space-time harmonic. (Again, the behaviour at $w = +\infty$ may be
understood by changing coordinates.)

The final assertion is that for fixed $x_0$ explosion becomes certain as $w \to -\infty$. It may be
verified by a domination argument involving the ODE (4.2) (time-shifted as above so that
$\lambda = 0$ and the initial time is $x_0$), whose paths explode simultaneously with those of (1.7).
Given $\epsilon > 0$, let $M$ be such that
$P\big( \sup_{x \in [x_0, x_0 + 1]} |\tfrac{2}{\sqrt\beta} b_x| > M \big) < \epsilon$. It is easy to check that
for $r_0$ sufficiently negative, the solution of $r' = x - (r + M)^2$ with initial value $r(x_0) = r_0$
explodes to $-\infty$ before time $x_0 + 1$. Now consider the solution of
$q' = x - (q + \tfrac{2}{\sqrt\beta} b)^2$ with $q(x_0) \le r_0 < -M$. With probability $1 - \epsilon$ we have
$q'(x) < r'(x)$ whenever $q(x) = r(x)$, so the paths never cross and $q$ explodes as well. □

Proof of Theorem 1.1 (ii). Writing L for the space-time generator of the SDE (1.7), the PDE 
(1.9) is simply the equation LF = 0. Therefore the hitting probability F(x,w) = Fp jW (x) 
satisfies the PDE. The boundary behaviour (1.10) follows from Lemma 4.1 and the fact that 
F(-,w) is a distribution function for each w. Specifically, the lower part of the boundary 
behaviour follows from the fact that F(x, w) is increasing in x and F(x, w) — > as w — > 
—00 for each x. The upper part follows from the fact that F(x, w) is increasing in w and 
F(x, w) — > 1 for fixed w as x — > 00. 

Toward uniqueness, suppose $\tilde F(x,w)$ is another bounded solution of (1.9), (1.10). By the
PDE, $\tilde F(x,p_x)$ is a local martingale under $P_{(x_0,w_0)}$ and thus a bounded martingale. Let $T$ be
the lifetime of the diffusion; optional stopping gives $\tilde F(x,w) = E_{(x,w)} \tilde F(T \wedge t, p_{T \wedge t})$ for all
$t > x$. Taking $t \to \infty$, we conclude by bounded convergence, the boundary behaviour of $\tilde F$
and the stated path properties of the diffusion that $\tilde F(x,w)$ is the non-explosion probability.
That is, $\tilde F = F$. □
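
The identification of $F(x,w)$ with a non-explosion probability suggests a simple Monte Carlo check. The sketch below is illustrative only. It assumes that the SDE (1.7) has the form $dp_x = \tfrac{2}{\sqrt{\beta}}\,db_x + (x - p_x^2)\,dx$, so that $L = \partial_x + \tfrac{2}{\beta}\partial_w^2 + (x - w^2)\partial_w$. It replaces explosion by the crossing of a finite negative level and the infinite time horizon by a finite one. The names `survives` and `estimate_F` and all numerical parameters are hypothetical choices.

```python
import math
import random


def survives(x0, w, beta, x_max, dt, rng, floor=-20.0):
    """Euler-Maruyama for dp = (2/sqrt(beta)) db + (x - p**2) dx from p(x0) = w.

    Returns False if the path falls below `floor` before time x_max
    (treated as explosion to -infinity), True otherwise.
    """
    sigma = 2.0 / math.sqrt(beta)
    x, p = x0, w
    while x < x_max:
        p += (x - p * p) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += dt
        if p < floor:
            return False
    return True


def estimate_F(x0, w, beta=2.0, x_max=4.0, dt=2e-3, n_paths=100, seed=0):
    """Monte Carlo estimate of the non-explosion probability F(x0, w)."""
    rng = random.Random(seed)
    hits = sum(survives(x0, w, beta, x_max, dt, rng) for _ in range(n_paths))
    return hits / n_paths


if __name__ == "__main__":
    # The estimate should increase with the starting point w.
    for w in (-5.0, 0.0, 5.0):
        print(w, estimate_F(0.0, w))
```

The estimates are crude (few paths, coarse step), but they display the monotonicity in $w$ and the limit $F(x,w) \to 0$ as $w \to -\infty$ used in the proof.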
As promised, we indicate how the laws of the higher eigenvalues $\Lambda_1, \Lambda_2, \ldots$ may be
characterized in terms of the PDE (1.9). The characterization is inductive and follows from (4.3)
by reasoning just as in the preceding proof.

Theorem 4.3. Let $F_{(0)}(x,w) = P_{\beta,w}(-\Lambda_0 < x)$. For each $k = 1, 2, \ldots$ the boundary value
problem for (1.9) given by

$$F(x,w) \to \begin{cases} 1 & \text{as } x, w \to \infty \text{ together}, \\ F_{(k-1)}(x_0, +\infty) & \text{as } w \to -\infty \text{ while } x \to x_0 \in \mathbb{R} \end{cases}$$

has a unique bounded solution $F_{(k)}$, and we have $P_{\beta,w}(-\Lambda_k < x) = F_{(k)}(x,w)$ for $w \in
(-\infty, \infty)$; further, $P_{\beta,\infty}(-\Lambda_k < x) = \lim_{w \to \infty} F_{(k)}(x,w)$.



5 Connection with Painlevé II

We now prove Theorem 1.9 and Corollary 1.10. We will need some standard facts about the 
function u(x) defined by (1.11), (1.12) and the derived functions v(x), E(x), F(x) defined 
in (1.13), (1.14). 

Fact 5.1. The following hold: 

(i) $u > 0$ on $\mathbb{R}$ and $u'/u \sim -\sqrt{x}$ as $x \to +\infty$.

(ii) E and F are distribution functions. 

(iii) $1 - E(x) = O(e^{-cx^{3/2}})$ for some $c > 0$ as $x \to +\infty$.

We will also take for granted some additional information about the functions f(x,w), 
g(x,w) defined by (1.15), (1.16). 

Fact 5.2. The following hold. 

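(i) For each $x \in \mathbb{R}$,

$$\lim_{w \to +\infty} \begin{pmatrix} f \\ g \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad (5.1)$$

$$\lim_{w \to -\infty} \begin{pmatrix} f \\ g \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \qquad (5.2)$$

(ii) For each $w \in \mathbb{R}$,

$$\frac{\partial}{\partial x} \begin{pmatrix} f \\ g \end{pmatrix} = \begin{pmatrix} 0 & u \\ u & -w \end{pmatrix} \begin{pmatrix} f \\ g \end{pmatrix}. \qquad (5.3)$$

(iii) There is the identity

$$g(x,w) = f(x,-w)\, e^{\frac{1}{3} w^3 - xw}. \qquad (5.4)$$

(iv) For fixed $w \in \mathbb{R}$,

$$f(x,w) \to 1 \quad \text{as } x \to +\infty; \qquad (5.5)$$
$$f(x,w) > 0 \quad \text{for } x \text{ sufficiently negative}. \qquad (5.6)$$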

These properties follow from an analysis of the associated Riemann-Hilbert problem with
the special monodromy data corresponding to the Hastings-McLeod solution (see Fokas, Its,
Kapaev and Novokshenov 2006). They are proved in Baik and Rains (2001) except for (iv),
which goes back to Deift and Zhou (1995). Interestingly, (1.16) and (5.1) are interchangeable
in that the latter also uniquely determines a solution of (1.15); this fact does not depend on
the specific solution of (1.11) specified by (1.12). By contrast, (5.2) does depend on (1.12).
Equations (1.15), (5.3) constitute a so-called Lax pair for the Painlevé II equation (1.11). (It
is in fact a simple transformation of the standard Flaschka-Newell Lax pair.) The consistency
condition of this overdetermined system, i.e. that the partials commute, is the Painlevé II
equation.
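
The consistency claim can be verified by a direct computation. The following sketch assumes that (1.15) reads $\partial_w (f,g)^T = A\,(f,g)^T$ and that (5.3) reads $\partial_x (f,g)^T = B\,(f,g)^T$ with the matrices below, and that (1.11) is Painlevé II in the form $u'' = 2u^3 + xu$. The names $A$ and $B$ are introduced here only for illustration.

```latex
A = \begin{pmatrix} u^2 & -wu - u' \\ -wu + u' & w^2 - x - u^2 \end{pmatrix},
\qquad
B = \begin{pmatrix} 0 & u \\ u & -w \end{pmatrix}.

% Equality of the mixed partials of (f,g)^T is the zero-curvature condition
\partial_x A - \partial_w B + AB - BA = 0.

% The two pieces, computed entry by entry:
\partial_x A - \partial_w B
  = \begin{pmatrix} 2uu' & -wu' - u'' \\ -wu' + u'' & -2uu' \end{pmatrix},
\qquad
AB - BA
  = \begin{pmatrix} -2uu' & 2u^3 + xu + wu' \\ -2u^3 - xu + wu' & 2uu' \end{pmatrix}.

% Adding them:
\partial_x A - \partial_w B + AB - BA
  = \bigl( 2u^3 + xu - u'' \bigr) \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.
```

Under these assumptions the partials commute exactly when $u'' = 2u^3 + xu$.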

Proof of Theorem 1.9, $\beta = 2$ case. Let $F_2(x,w)$ denote the right-hand side of (1.17). Using
(1.14), (1.15) and (5.3), we check that $F_2$ solves the PDE (1.9) with $\beta = 2$: compute

$$\frac{\partial F_2}{\partial x} = \bigl\{ vf + ug \bigr\} F,$$

$$\frac{\partial F_2}{\partial w} = \bigl\{ u^2 f + (-wu - u') g \bigr\} F,$$

$$\frac{\partial^2 F_2}{\partial w^2} = \bigl\{ \bigl( u^4 + w^2 u^2 - (u')^2 \bigr) f + \bigl( -u + (wu + u')(x - w^2) \bigr) g \bigr\} F$$

and substitute. The coefficient of $g$ vanishes and the coefficient of $f$ is

$$v + u^4 - (u')^2 + xu^2.$$

Differentiating, we see that this quantity is constant by (1.11). As all terms vanish in the
limit as $x \to \infty$, the constant is zero.

We must check that $F_2$ is bounded and that it has the boundary behaviour (1.10). To
this end we claim $f, g > 0$ on $\mathbb{R}^2$. Fixing $w$, (5.6), (5.4) cover $x$ sufficiently negative. Now
(5.3) shows $f$ increases at least until $x_0 = \min\{x : g(x,w) = 0\}$. But if $x_0$ exists then
(5.3) shows $\frac{\partial g}{\partial x}(x_0) > 0$, a contradiction. This proves the claim. It now follows from (5.3)
that $\frac{\partial f}{\partial x} > 0$. From (5.5) we deduce that $f < 1$; in particular $f$ is bounded, and hence so
is $F_2$. Furthermore, for a given $x \in \mathbb{R}$ and $\varepsilon > 0$, (5.1) yields $w_+$ such that $f > 1 - \varepsilon$ on
$[x, \infty) \times [w_+, \infty)$, and (5.2) yields $w_-$ such that $f < \varepsilon$ on $(-\infty, x] \times (-\infty, w_-]$. Using that
$F(x)$ is a distribution function, (1.10) follows. □
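
The claim that the coefficient of $f$ is constant can be spelled out in one line. This is a short sketch assuming that (1.11) is the Painlevé II equation in the form $u'' = 2u^3 + xu$ and that $v' = -u^2$, as would follow if (1.13) defines $v(x) = \int_x^\infty u^2$:

```latex
\frac{d}{dx} \Bigl( v + u^4 - (u')^2 + x u^2 \Bigr)
  = v' + 4u^3 u' - 2u' u'' + u^2 + 2x u u'
  = \bigl( v' + u^2 \bigr) + 2u' \bigl( 2u^3 + xu - u'' \bigr)
  = 0.
```

Both brackets vanish separately under the stated assumptions, so the quantity is constant in $x$.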



Proof of Theorem 1.9, $\beta = 4$ case. That the right-hand side $F_4$ of (1.18) satisfies the PDE
(1.9) with $\beta = 4$ may be verified just as in the $\beta = 2$ case; the computation is more tedious
but the result is very similar and the final step is the same.

It is a little more work to get boundedness and the boundary behaviour (1.10) this time.
Dropping the scale factors on $x, w$, consider

$$G = F^{-1/2} F_4 = \tfrac{1}{2} \bigl( E^{-1/2} + E^{1/2} \bigr) f + \tfrac{1}{2} \bigl( E^{-1/2} - E^{1/2} \bigr) g.$$

Clearly $G > 0$. For fixed $w$, $G \to 1$ as $x \to \infty$ by (5.5) and the fact that $E^{-1/2} - E^{1/2} =
O(e^{-cx^{3/2}})$ while $g = O(e^{-wx})$ from (5.4). Now by (5.3) we have

$$\frac{\partial G}{\partial x} = \tfrac{1}{2} \bigl( E^{-1/2} + E^{1/2} \bigr) \bigl( \tfrac{1}{2} ug \bigr) + \tfrac{1}{2} \bigl( E^{-1/2} - E^{1/2} \bigr) \bigl( \tfrac{1}{2} uf - wg \bigr),$$

which is positive for $w < 0$. Boundedness in the lower half-plane $\{w < 0\}$ follows, as does
the lower boundary behaviour using (5.2).

From (5.4) we immediately see $g < 1$ on $\{x > 0,\ 0 < w < \sqrt{3x}\}$. By Lemma 4.1,
$\frac{\partial}{\partial w} F_{\beta,w}(x) > 0$. The $\beta = 2$ case of the present theorem then implies that $\frac{\partial f}{\partial w} > 0$. From
(1.15) we conclude $g < u/(w + u'/u)$ provided the denominator is positive. But $u'/u \sim -\sqrt{x}$
as $x \to +\infty$, so there is $x_1$ such that $u'/u > -\sqrt{2x}$ for $x > x_1$. The latter bound for $g$
therefore implies that $g$ is bounded on $\{x > x_1,\ w > \sqrt{3x}\}$. Moreover, for any $x_0 < x_1$ we
have that $u$ and $u'/u$ are bounded on the interval $x_0 < x < x_1$, so $g$ is bounded uniformly
over these $x$ for all $w$ sufficiently large. Putting these bounds together we conclude $g$ is
bounded on all right half-planes $\{x > x_0\}$, and the same then follows for $F_4$.

The upper boundary behaviour follows as well. Indeed, as $x, w \to \infty$ together the
coefficient of $g$ vanishes while the coefficient of $f$ tends to 1; the $g$-term then vanishes while
the $f$-term tends to 1 as in the $\beta = 2$ case.

It remains to show $F_4$ is bounded on the whole plane; it suffices to bound $F_4$ on the
upper-left quadrant $Q = \{x < 0,\ w > 0\}$. Here we can use the fact that $F_4$ solves the
PDE. With notation as in Theorem 1.7 we have that $F_4(x, p_x)$ is a local martingale under
$P_{(x_0,w_0)}$. By boundedness on right half-planes, it is in fact a bounded martingale. Using
that paths explode only to $-\infty$, optional stopping gives the representation $F_4(x_0, w_0) =
E_{(x_0,w_0)} F_4(T, p_T)$ where $T = \inf\{x : (x, p_x) \notin Q\}$. The bound thus extends to $Q$. □
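
The two elementary bounds on $g$ used in this proof can be made explicit. The sketch below uses (5.4) and the bounds $0 < f < 1$, $g > 0$ from the $\beta = 2$ case. It assumes that the first row of (1.15) reads $\partial f/\partial w = u^2 f - (wu + u')g$, which is consistent with the expression for $\partial F_2/\partial w$ in the $\beta = 2$ proof.

```latex
% Bound 1: for x > 0 and 0 < w < \sqrt{3x} the exponent in (5.4) is negative.
g(x,w) = f(x,-w)\, e^{\frac{1}{3} w^3 - xw}
       = f(x,-w)\, e^{\frac{1}{3} w (w^2 - 3x)} < 1.

% Bound 2: \partial f / \partial w > 0 and f < 1 give, when wu + u' > 0,
0 < \frac{\partial f}{\partial w} = u^2 f - (wu + u')\, g
\quad \Longrightarrow \quad
g < \frac{u^2 f}{wu + u'} < \frac{u^2}{wu + u'} = \frac{u}{w + u'/u}.

% On \{x > x_1,\ w > \sqrt{3x}\} the denominator is bounded below:
w + \frac{u'}{u} > \sqrt{3x} - \sqrt{2x} = (\sqrt{3} - \sqrt{2}) \sqrt{x} > 0.
```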

Proof of Corollary 1.10. These identities are straightforward consequences of the theorem,
(1.16) and (5.1). □






Acknowledgements. The second author is very grateful to José Ramírez for conversations
that helped this project go forward. The first author is indebted to Alexander Its for his 
patient and thorough explanations. We would like to thank Jinho Baik, Alexei Borodin, 
Peter Forrester, Arno Kuijlaars, Eric Rains, Brian Rider, Brian Sutton, Dong Wang and 
Ofer Zeitouni for interesting and helpful discussions, as well as AIM and MSRI for providing 
stimulating environments in December 2009 and September 2010 workshops. The work of 
the first author was supported in part by an NSERC postgraduate scholarship held at the 
University of Toronto, and that of the second author by the Canada Research Chair program 
and the NSERC DAS program. 

References 

Anderson, G., Guionnet, A. and Zeitouni, O. (2009). An Introduction to Random Matrices,
Cambridge University Press.

Anderson, T. W. (1963). Asymptotic theory for principal component analysis, Ann. Math. 
Statist. 34: 122-148. 

Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis, third edn,
Wiley-Interscience.

Bai, Z. D. (1999). Methodologies in spectral analysis of large-dimensional random matrices, 
a review, Statist. Sinica 9: 611-677. 

Bai, Z. D. and Silverstein, J. W. (1998). No eigenvalues outside the support of the limiting
spectral distribution of large-dimensional sample covariance matrices, Ann. Probab.
26: 316-345.

Bai, Z. D. and Silverstein, J. W. (1999). Exact separation of eigenvalues of large-dimensional 
sample covariance matrices, Ann. Probab. 27: 1536-1555. 

Bai, Z. and Yao, J.-f. (2008). Central limit theorems for eigenvalues in a spiked population
model, Ann. Inst. Henri Poincaré Probab. Stat. 44: 447-474.

Baik, J. (2006). Painlevé formulas of the limiting distributions for nonnull complex sample
covariance matrices, Duke Math. J. 133: 205-235.

Baik, J., Ben Arous, G. and Péché, S. (2005). Phase transition of the largest eigenvalue for
nonnull complex sample covariance matrices, Ann. Probab. 33: 1643-1697.

Baik, J. and Rains, E. M. (2000). Limiting distributions for a polynuclear growth model 
with external sources, J. Statist. Phys. 100: 523-541. 

Baik, J. and Rains, E. M. (2001). The asymptotics of monotone subsequences of involutions, 
Duke Math. J. 109: 205-281. 






Baik, J. and Silverstein, J. W. (2006). Eigenvalues of large sample covariance matrices of 
spiked population models, J. Multivariate Anal. 97: 1382-1408. 

Bassler, K. E., Forrester, P. J. and Frankel, N. E. (2010). Edge effects in some perturbations 
of the Gaussian unitary ensemble, J. Math. Phys. 51: 123305, 16. 

Ben Arous, G. and Corwin, I. (2011). Current fluctuations for TASEP: a proof of the
Prähofer-Spohn conjecture, Ann. Probab. 39: 104-138.

Benaych-Georges, F. and Nadakuditi, R. R. (2009). The eigenvalues and eigenvectors of 
finite, low rank perturbations of large random matrices, arXiv:0910.2120v2. 

Bloemendal, A. (2011+). In preparation.

Bloemendal, A. and Sutton, B. D. (2011+). In preparation.

Deift, P. A. and Zhou, X. (1995). Asymptotics for the Painlevé II equation, Comm. Pure
Appl. Math. 48: 277-337.

Desrosiers, P. and Forrester, P. J. (2006). Asymptotic correlations for Gaussian and Wishart 
matrices with external source, Int. Math. Res. Not. 2006: Art. ID 27395, 43 pp. 

Dumitriu, I. and Edelman, A. (2002). Matrix models for beta ensembles, J. Math. Phys. 
43: 5830-5847. 

Edelman, A. and Sutton, B. D. (2007). From random matrices to stochastic operators, J. 
Stat. Phys. 127: 1121-1165. 

El Karoui, N. (2003). On the largest eigenvalue of Wishart matrices with identity covariance
when $n$, $p$, and $p/n \to \infty$, arXiv:math/0309355v1.

El Karoui, N. (2007). Tracy-Widom limit for the largest eigenvalue of a large class of complex
sample covariance matrices, Ann. Probab. 35: 663-714.

Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence, 
John Wiley & Sons, Inc. 

Féral, D. and Péché, S. (2007). The largest eigenvalue of rank one deformation of large
Wigner matrices, Comm. Math. Phys. 272: 185-228.

Féral, D. and Péché, S. (2009). The largest eigenvalues of sample covariance matrices for a
spiked population: diagonal case, J. Math. Phys. 50: 073302, 33 pp.

Fokas, A. S., Its, A. R., Kapaev, A. A. and Novokshenov, V. Y. (2006). Painlevé Transcendents:
The Riemann-Hilbert Approach, American Mathematical Society.

Forrester, P. J. (1993). The spectrum edge of random matrix ensembles, Nuclear Phys. B 
402: 709-728. 

Forrester, P. J. (2010). Log-gases and Random Matrices, Princeton University Press. 






Forrester, P. J. (2011). Probability densities and distributions for spiked Wishart
$\beta$-ensembles, arXiv:1101.2261v1.

Geman, S. (1980). A limit theorem for the norm of random matrices, Ann. Probab. 8: 252-261.

Halmos, P. (1957). Introduction to Hilbert Space and the Theory of Spectral Multiplicity, 
Chelsea Publishing Co. 

Harding, M. (2008). Explaining the single factor bias of arbitrage pricing models in finite 
samples, Economics Letters 99: 85-88. 

Hastings, S. P. and McLeod, J. B. (1980). A boundary value problem associated with the
second Painlevé transcendent and the Korteweg-de Vries equation, Arch. Rational Mech.
Anal. 73: 31-51.

Johansson, K. (2000). Shape fluctuations and random matrices, Comm. Math. Phys. 
209: 437-476. 

Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components 
analysis, Ann. Statist. 29: 295-327. 

Johnstone, I. M. (2007). High dimensional statistical inference and random matrices,
International Congress of Mathematicians. Vol. I, Eur. Math. Soc., Zürich, pp. 307-333.

Krishnapur, M., Rider, B. and Virág, B. (2011+). In preparation.

Marčenko, V. A. and Pastur, L. A. (1967). Distribution of eigenvalues in certain sets of
random matrices, Mat. Sb. (N.S.) 72 (114): 507-536.

Mo, M. Y. (2011). The rank 1 real Wishart spiked model, arXiv:1101.5144v1.

Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory, John Wiley & Sons Inc. 

Onatski, A. (2008). The Tracy-Widom limit for the largest eigenvalues of singular complex
Wishart matrices, Ann. Appl. Probab. 18: 470-490.

Patterson, N., Price, A. L. and Reich, D. (2006). Population structure and eigenanalysis,
PLoS Genetics 2: e190.

Paul, D. (2007). Asymptotics of sample eigenstructure for a large dimensional spiked
covariance model, Statist. Sinica 17: 1617-1642.

Péché, S. (2006). The largest eigenvalue of small rank perturbations of Hermitian random
matrices, Probab. Theory Related Fields 134: 127-173.

Péché, S. (2009). Universality results for the largest eigenvalues of some sample covariance
matrix ensembles, Probab. Theory Related Fields 143: 481-516.

Ramírez, J. A., Rider, B. and Virág, B. (2011). Beta ensembles, stochastic Airy spectrum,
and a diffusion, J. Amer. Math. Soc. 24: 919-944.






Savchuk, A. M. and Shkalikov, A. A. (1999). Sturm-Liouville operators with singular
potentials, Math. Notes 66: 897-912.

Silverstein, J. W. and Bai, Z. D. (1995). On the empirical distribution of eigenvalues of a 
class of large-dimensional random matrices, J. Multivariate Anal. 54: 175-192. 

Soshnikov, A. (2002). A note on universality of the distribution of the largest eigenvalues in 
certain sample covariance matrices, J. Statist. Phys. 108: 1033-1056. 

Sutton, B. D. (2005). The Stochastic Operator Approach to Random Matrix Theory, PhD 
thesis, Massachusetts Institute of Technology. 

Telatar, E. (1999). Capacity of multi-antenna Gaussian channels, Europ. Trans. Telecom. 
10: 585-595. 

Tracy, C. A. and Widom, H. (1994). Level-spacing distributions and the Airy kernel, Comm. 
Math. Phys. 159: 151-174. 

Tracy, C. A. and Widom, H. (1996). On orthogonal and symplectic matrix ensembles, Comm. 
Math. Phys. 177: 727-754. 

Trotter, H. F. (1984). Eigenvalue distributions of large Hermitian matrices; Wigner's
semicircle law and a theorem of Kac, Murdock, and Szegő, Adv. in Math. 54: 67-82.

Wang, D. (2008). Spiked Models in Wishart Ensemble, PhD thesis, Brandeis University.
arXiv:0804.0889v1.

Weidmann, J. (1997). Strong operator convergence and spectral theory of ordinary
differential operators, Univ. Iagel. Acta Math. 34: 153-163.

Yin, Y. Q., Bai, Z. D. and Krishnaiah, P. R. (1988). On the limit of the largest eigenvalue of
the large-dimensional sample covariance matrix, Probab. Theory Related Fields 78: 509-521.



Alex Bloemendal
Department of Mathematics
Harvard University
Cambridge, MA 02138
alexb@math.harvard.edu

Bálint Virág
Departments of Mathematics and Statistics
University of Toronto
Toronto ON M5S 2E4, Canada
balint@math.toronto.edu






